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Antigens and Their Detection 

TECHNICAL FIELD 

[001] The invention relates to novel nucleotide 
sequences located in a gene which encodes a bacterial 
flagellin antigen, and the use of those nucleotide 
sequences for the detection of bacteria which express 
particular ' flagellin antigens, on the basis of that 
antigen alone, or in conjunction with the O antigen 
expressed by that strain. 

BACKGROUND ART 

[002] The flagellum of many bacteria appears to be 
made up of a single protein known as flagellin. The 
serotyping schemes of E, coli and Salmonella enterics are 
based on highly variable antigenic surface structures 
which include the lipopolysaccharide which carries the O 
antigen and flagellin which is now known to be the carrier 
of the classical H antigen. In many strains of 5. enterica 
there are two loci (fliC and fljB) which encode flagellin, 
and a regulatory system which allows one only to be 
expressed at any time; and which also provides for 
expression to rapidly alternate between the two forms 
first identified as two phases (HI and H2) for the H 
antigen of most strains. In E, coli there are 54 forms of 
H antigen recognised and until recently they were all 
thought to be encoded at the fliC locus, as has been shown 
for E. coli K-12. However in the 1980s Ratiner [Ratiner Y 
A "Phase variation of the H antigen in Escherichia coli 
strain Bi327-41, the standard strain for Escherichia coli 
flagellin antigen H3" FEMS Microbiol. Lett 15 (1982) 33- 
36; Ratiner Y A "Presence of two structural genes 
determining antigenically different phase-specific 
flagellins in some Escherichia coli strains" FEMS 
Microbiol. Lett. 19 (1983) 37-41; Ratiner Y A "Two genetic 
arrangements determining flagellin antigen specificities 
in two diphasic Escherichia coli strains" FEMS Microbiol. 
Lett. 29 (1985) 317-323; Ratiner Y A "Different alleles of 
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the flagellin gene hagB in Escherichia coli standard H 
test strains" FEMS Microbiol Lett. 48 (1987) 97-104] 
showed that in some cases there are two loci and that 
expression can alternate. The matter was further 
complicated by a recent paper by Ratiner [Ratiner Y A 
(1998) "New f lagellin-specif ying genes in some Escherichia 
coli strains" J. Bacterid. 180 979-984] showing three 
loci {flkr fll and flm) for flagellin in addition to fliC 
although the fljB locus has not been found in E. coli. 
However E. coli strains are normally identified by the 
combination of one O antigen and one H antigen [and K 
antigen when present as a capsule (K) antigen] , with no 
problems reported for the vast majority of cases with 
alternate phases, while 5. enterica strains are normally 
identified by the combination of O, HI and H2 antigens. It 
is still not clear how widespread in E, coli H antigens 
determined by flagellin genes other than fliC are. 

[003] Typing is typically carried out using 
specific antisera. The incidence of pathogenic E. coli in 
association with human and animal disease supports the 
need for suitable and rapid typing techniques. 

DESCRIPTION OF THE INVENTION 

[004] In a first aspect, the present invention 

provides a novel nucleic acid molecule encoding all or 

part of an E, coli flagellin protein. 

[005] The present invention provides, for the 
first time, full length sequence for a flagellin gene for 
the following E. coli type strains: H6 (SEQ ID NO: 8), 
H9(SEQ ID NOill), H10(SEQ ID N0:12), H14 (SEQ ID N0:15), 
H18(SEQ ID N0:18), H23 (SEQ ID NO:22), H51 (SEQ ID NO:50), 
H45(SEQ ID NO:43), H49 (SEQ ID NO:48), H19(SEQ ID N0:19), 
H30(SEQ ID NO:29), H32 (SEQ ID N0:31), H26(SEQ ID NO:25), 
H41(SEQ ID NO:39), H15 (SEQ ID N0:16), H20(SEQ ID NO:20), 
H28(SEQ ID NO:27), H46(SEQ ID NO:44), H31 (SEQ ID NO:30), 
H34(SEQ ID NO:33), H43 (SEQ ID N0:41) and H52 (SEQ ID 
NO: 51). Corrected full length sequences have been 
obtained for H7 (SEQ ID NO: 9) and H12(SEQ ID NO: 14) type 
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strains . 

[006] Partial flagellin gene sequence, including 
the central variable region, has been obtained for the 
following E. coli H type strains: H40 (SEQ ID NO: 38), 
H8(SEQ ID N0:10), H21(SEQ ID NO:21), H47 (SEQ ID NO:46), 
HIKSEQ ID N0:13), H17 (SEQ ID N0:17), H25 (SEQ ID NO:24), 
H42(SEQ ID NO:40), H27 (SEQ ID NO:26), H35 (SEQ ID NO:34), 
H2(SEQ ID NO: 67), H3(SEQ ID NO: 68), H24(SEQ ID NO:23), 
H37(SEQ ID NO:35), H50(SEQ ID NO: 49), H4 (SEQ ID NO: 6), 
H44(SEQ ID NO:42), H38(SEQ ID NO:36), H39(SEQ ID NO:37), 
H55(SEQ ID NO:53), H29(SEQ ID NO:28), H33(SEQ ID NO:32), 
H5(SEQ ID N0:7), H54 (SEQ ID NO: 52) and H56(SEQ ID NO: 54). 

[007] Comparison of sequences demonstrates that 
unique flagellin genes have now been sequenced (partially 
or completely) for the following E. coll H type strains: 
HI, H2, H3, H5, H6, H7 , H9, Hll, H12, H14, H15, HIS, H19, 
H20, H21, H23, H24, H25, H26, H27, H28, H29, H30, H31, 
H32, H33, H34, H35, H37, H38, H39, H41, H42, H43, H45, 
H46, H48, H49, H51, H52, H54, and H56 and either H8 or 
H40, HIO or H50 and H4 or H17. 

[008] By comparison of these sequences, the 
present inventors were able to identify specific sequences 
for each of the above H serotypes . 

[009] The present invention also provides fllC 
sequences from 10 different H7 strains, in addition to 
that from the H7 type strain, and two sequences specific 
to H7 of 0157 and 055 E. coll strains. 

[0010] The present invention encompasses all or 
part of the flagellin genes sequenced for H2, H3, H5, H6, 
H9, Hll, H14, H18, H19, H20, H21, H23, H24, H25, H26, H27, 
H28, H29, H30, H31, H32, H33, H34, H35, H37, H38, H39, 
H41, H42, H43, H44, H45, H46, H47, H48, H49, H51, H52, 
H54, H55, H56, H8, H40, H15, HIO, or H50, H4 and H17 type 
strains. Of these flagellin genes sequenced, those from 
the type strains for H8 and H40 are identical, those from 
type strains HIO and H50, HI and H12, H38 and H55, H21 and 
H47, and H4, H17 and H44 type strains are highly similar. 
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[0011] The invention also encompasses newly 
provided sequence for H7 and H12 as well as novel primers 
for the specific amplification of HI, H7, H12 and H48 as 
well as for the other above mentioned newly sequenced 
flagellin genes. 

[0012] By cloning and expression of these 
sequenced flagellin genes in a fliC deletion E. coll K-12 
strain, and use of anti-H antiserum, we have confirmed the 
H specificities encoded by 39 f alq e llin f laqellin genes . 
The 39 H specificities are HI, H2, H4, H5, H6, H7, H9, 
HIO, Hll, H12, H14, H15, H16, H18, H19, H20, H21, H23, 
H24, H26, H27, H28, H29, H30, H31, H32, H33, H34, H38, 
H39, H41, H42, H43, H45, H46, H49, H51, H52, and H56, 
encoded by flagellin genes obtained from H type strains 
for HI, H2, H4, H5, H6, H7, H9, HIO, Hll, H12, H14, H15, 
H^H16, H18, H19, H20, H21, H23, H24, H26, H27, H28, H29, 
H30, H31, H32, H33, H34, K3B, K39, H41, H42, n43, H45, 
H46, H49, H51, H52, and H56 respectively. 

[0013] The nucleic acid molecules of the invention 
may be variable in length. In one embodiment they are 
oligonucleotides of from about 10 to about 20 nucleotides 
in length. The oligonucleotides of the invention are 
specific for the flagellin gene from which they are 
derived and are derived from the central region of the 
gene. In one embodiment, oligonucleotides in accordance 
with the present invention^, which also include 
oligonucleotides from the previously sequenced E, coll HI, 
H7, H12 and H48 genes, are those shown in Table 3. 

[0014] The 45 sequences (see Table 3) provide a 
panel to which newly sequenced genes can be compared to 
select specific oligonucleotides for those newly sequenced 
genes . 

[0015] In a second aspect the invention provides a 
method of detecting the presence of E. coll of a 
particular H serotype in a sample, the method comprising 
the step of specifically hybridising at least one nucleic 
acid molecule derived from a flagellin gene, wherein the 
at least one nucleic acid molecule is specific for a 
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particular flagellin gene associated with the H serotype, 
to any E. coll in the sample which contain the gene, and 
detecting any specifically hybridised nucleic acid 
molecules, wherein the presence of specifically hybridised 
nucleic acid molecules identifies the presence of the H 
serotype in the sample. 

[0016] In one preferred embodiment the detection 
method is a Southern blot method- More preferably, the 
nucleic acid molecule is labelled and hybridisation of the 
nucleic acid molecule is detected by autoradiography or 
detection of fluorescence. 

[0017] Preferred nucleic acid molecules for the 
detection of particular flagellin genes are listed in 
Table 3, 

[0018] In a third aspect the invention provides a 
method of detecting the presence of E. coll of a 
particular H serotype in a sample, the method comprising 
the step of specifically hybridising at least one pair of 
nucleic acid molecules to any E, coli in the sample which 
contains the flagellin gene for the particular H serotype, 
wherein at least one of the nucleic acid molecules is 
specific for the particular flagellin gene associated with 
the H serotype, and detecting any specifically hybridised 
nucleic acid molecules, wherein the presence of 
specifically hybridised nucleic acid molecules identifies 
the presence of the H serotype in the sample . 

[0019] In one preferred embodiment the detection 
method is a polymerase chain reaction method. More 
preferably, the nucleic acid molecules are labelled and 
hybridisation of the nucleic acid molecule is detected by 
electrophoresis . 

[0020] It is recognised that there may be 
instances where spurious hybridisation will arise through 
the initial selection of a sequence found in many 
different genes but this is typically recognisable by, for 
instance, comparison of band sizes against controls in PGR 
gels, and an alternative sequence can be selected. 

[0021] In a fourth aspect the invention provides a 
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method for detecting the presence of a particular 0 
serotype and H serotype of E. coii in a sample, the method 
comprising the following steps: 

(a) specifically hybridising at least one nucleic 
acid molecule, derived from and specific for a gene 
encoding a transferase or a gene encoding an enzyme for 
the transport or processing of a polysaccharide or 
oligosaccharide unit, the gene being involved in the 
synthesis of a particular E. coli O antigen, to any E. 
coll in the sample which contain the gene; 

(b) specifically hybridising at least one nucleic 
acid molecule derived from and specific for a particular 
flagellin gene associated with that H serotype, to any E. 
coli in the sample which contain the gene; and 

(c) detecting any specifically hybridised nucleic 
acid molecules. 

[0022] Preferred nucleic acid molecules for the 
detection of particular flagellin genes are listed in 
Table 3. 

[0023] In one preferred embodiment, the sequence 
of the nucleic acid molecule specific for the 0 antigen is 
specific to the nucleotide sequence encoding the Olll 
antigen. More preferably, the sequence is derived from a 
gene selected from the group consisting of wbdH 
(nucleotide position 739 to 1932 of Figure 5) , wzx 
(nucleotide position 8646 to 9911 of Figure 5) , wzy 
(nucleotide position 9901 to 10953 of Figure 5) , whdM 
(nucleotide position 11821 to 12945 of Figure 5) and 
fragments of those molecules of at least 10-12 nucleotides 
in length. Particularly preferred nucleic acid molecules 
are those set out in Tables 8 and 8A, with respect to the 
above mentioned genes. 

[0024] In another preferred embodiment, the 
sequence of the nucleic acid molecule specific for the O 
antigen is specific to the nucleotide sequence encoding 
the 0157 antigen. More preferably, the sequence is 
derived from a gene selected from the group consisting of 
wbdN (nucleotide position 79 to 861 of Figure 6} , wbdO 
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(nucleotide position 2011 to 2757 of Figure 6) , wbdP 
(nucleotide position 5257 to 6471 of Figure 6) , wbdR 
(nucleotide position 13156 to 13821 of Figure 6) , wzx 
(nucleotide position 2744 to 4135 of Figure 6) and wzy 
(nucleotide position 858 to 2042 of Figure 6) and 
fragments of those molecules of at least 10-12 nucleotides 
in length. Particularly preferred nucleic acid molecules 
are those set out in Tables 9 and 9A, with respect to the 
above mentioned genes. 

[0025] In one preferred embodiment the detection 
method is a Southern blot method. More preferably, the 
nucleic acid molecule is labelled and hybridisation of the 
nucleic acid molecule is detected by autoradiography or 
detection of fluorescence. 

[0026] In a fifth aspect the invention provides a 
method for detecting the presence of a particular O 
serotype and K serotype of E. coll in a sample, the method 
comprising the following steps: 

(a) specifically hybridising at least one pair 
of nucleic acid molecules, at least one of which is 
derived from and specific for a gene encoding a 
transferase or a gene encoding an enzyme for the transport 
or processing of a polysaccharide or oligosaccharide unit, 
the gene being involved in the synthesis of the particular 
E. coll O antigen, to any E. coll in the sample which 
contain the gene; 

(b) specifically hybridising at least one pair 
of nucleic acid molecules, at least one of which is 
derived from and specific for a particular flagellin gene 
associated with the particular H serotype, to any E. coll 
in the sample which contain the gene; and 

(c) detecting any specifically hybridised 
nucleic acid molecules. 

[0027] Preferred nucleic acid molecules for the 
detection of particular flagellin genes are listed in 
Table 3. 

[0028] In one preferred embodiment, the sequence 
of the nucleic acid molecule specific for the O antigen is 
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specific to the nucleotide sequence encoding the 0111 
antigen. More preferably, the sequence is derived from a 
gene selected from the group consisting of wbdH 
(nucleotide position 739 to 1932 of Figure 5) , wzx 
(nucleotide position 8646 to 9911 of Figure 5), wzy 
(nucleotide position 9901 to 10953 of Figure 5) , whdM 
(nucleotide position 11821 to 12945 of Figure 5) and 
fragments of those molecules of at least 10-12 nucleotides 
in length. Particularly preferred nucleic acid molecules 
are those set out in Tables 8 and 8A, with respect to the 
above mentioned genes. 

[0029] In another preferred embodiment, the 
sequence of the nucleic acid molecule specific for the O 
antigen is specific to the nucleotide sequence encoding 
the 0157 antigen. More preferably, the sequence is 
derived from a gene selected from the group consisting of 
wbciN (nucleotide position 7 9 to 8 61 of Figure 6) , whdO 
(nucleotide position 2011 to 2757 of Figure 6) , wbdP 
(nucleotide position 5257 to 6471 of Figure 6) , whdR 
(nucleotide position 13156 to 13821 of Figure 6) , wzx 
(nucleotide position 2744 to 4135 of Figure 6) and wzy 
(nucleotide position 858 to 2042 of Figure 6) and 
fragments of those molecules of at least 10-12 nucleotides 
in length- Particularly preferred nucleic acid molecules 
are those set out in Tables 9 and 9A, with respect to the 
above mentioned genes. 

[0030] In one preferred embodiment the detection 
method is a polymerase chain reaction method. More 
preferably, the nucleic acid molecules are labelled and 
hybridisation of the nucleic acid molecule is detected by 
electrophoresis . 

[0031] The present inventors believe that based on 
the teachings of the present invention and available 
information concerning 0 antigen gene clusters, and 
through use of experimental analysis, comparison of 
nucleic acid sequences or predicted protein structures, 
nucleic acid molecules in accordance with the invention 
can be readily derived for any particular O antigen of 
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interest. Suitable bacterial strains can typically be 
acquired coinxnercially from depositary institutions. 

[0032] There are currently 166 defined E. coll O 
antigens . 

[0033] Samples of the 166 different coli O 
antigen serotypes are available from Statens Serum 
Institute Copenhagen, Denmark. 

[0034] The inventors envisage rare circumstances 
whereby two genetically similar gene clusters encoding 
serologically different O antigens have arisen through 
recombination of genes or mutation so as to generate 
polymorphic variants. In these circumstances multiple 
pairs of oligonucleotides may be selected to provide 
hybridisation to the specific combination of genes. The 
invention thus envisages the use of a panel containing 
multiple nucleic acid molecules for use in the method of 
testing for O antigen in conjunction with H antigexi;- 
wherein the nucleic acid molecules are derived from genes 
encoding transferases and/or enzymes for the transport or 
processing of a polysaccharide or oligosaccharide unit 
including wzx or wzy genes, wherein the panel of nucleic 
acid molecules is specific to a particular O antigen. The 
panel of nucleic acid molecules can include nucleic acid 
molecules derived from O antigen sugar pathway genes where 
necessary. 

[0035] The inventors also found two mutated 
flagellin genes from H type strains for H35 and H54 which 
have insertion sequences inserted into normal flagellar 
genes identical or near identical to that that of the Hll 
and H21 type strains respectively. Thus, primers for Hll 
and H21 (listed in Table 3) would also amplify fragments 
in H35 and H54, which differ in sizes to those in Hll and 
H21 respectively. The inventors also provide two pairs of 
primers each for H35 and H54 based on the insertion 
sequence (see H35 and H54 columns in Table 3) . The use of 
one of them in combination with one of the Hll or H21 
primers will generate a PGR band only in H35 or H54 
respectively, and this will also differentiate H35 and H54 
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from Hll and H21 respectively. 

[0036] The present invention also relates to 
methods of detecting the presence of particular coli H 
antigens or H antigen and 0 antigen combinations where one 
or more nucleic acid molecules which generate a particular 
size fragment indicative of the presence of that H antigen 
are used or in which the combination of one antigen 
specific primer for that H antigen with another primer for 
a related H antigen provides for the detection of the 
particular H antigen by hybridisation to the relevant 
gene. Preferably, the H antigen is Hll, H21, H35 or H54 . 

[0037] The pairs of nucleic acid molecules where 
the method of the fifth aspect is used may both hybridise 
to the relevant H or O antigen gene or alternatively only 
one may hybridise to the relevant gene and the other to 
another site. 

[0038] The inventors recognise in applying the 
methods of the invention for detecting combinations of O 
and H antigens to samples, that the methods do not 
indicate whether a positive result for a particular O and 
H antigen combination arises because the O and H antigen 
are present on a single E. coll strain present in the 
sample or are present on different E. coll strains present 
in the sample. Because the ability to identify the 
presence of E, coll strains with particular O and H 
antigen combinations is highly desirable (due to the 
relationship between particular combinations and 
pathogenicity) the determination that a particular 
combination is present in a sample can be followed by 
isolation of single colonies and checking whether the they 
contain the relevant combination by using the same method 
again or using antibody labelled magnetic beads to 
separate cells expressing the particular O or H antigen 
and then testing the isolated cells for the other 
serotype . 

[0039] In addition, as mentioned above, the 
present inventors have established the existence of H7 
primers specific to the 0157 and 055 serotypes. Using 
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such primers it is possible to detect particular 0 and H 
antigen combinations with the use of H specific nucleic 
acid molecules. 

[0040] In a sixth aspect the invention provides a 
method for detecting the presence of a particular O 
serotype and H serotype of E. coli in a sample, the method 
comprising the following steps: 

(a) specifically hybridising at least one 
nucleic acid molecule, derived from and specific for a 
gene encoding a flagellin associated with a particular E. 
coll H antigen serotype to any E, coll carrying the gene 
and present in the sample; and 

(b) detecting the at least one specifically 
hybridised nucleic acid molecule, wherein the at least one 
nucleic acid molecule is specific for the particular 
combination of 0 and H antigen. 

[0041] Preferably the combination is 055:H7 or 

0157:H7. 

[0042] The ability to detect the 0157 :H7 
combination from a particular H7 primer or pair is of 
particular use given the association of this combination 
with pathogenic strains . 

[0043] In a seventh aspect the present invention 
provides a method for testing a food derived sample for 
the presence of one or more particular £. coll O antigens 
and H antigens comprising testing the sample by a method 
of the fourth, fifth or sixth aspect the invention. 

[0044] In an eighth aspect the present invention 
provides a method for testing a faecal derived sample for 
the presence of one or more particular E. coll O antigens 
and H antigens comprising testing the sample by a method 
of the fourth, fifth or sixth aspect the invention. 

[0045] In a ninth aspect the present invention 
provides a method for testing a patient or animal derived 
sample for the presence of one or more particular E, coll 
O antigens and H antigens comprising testing the sample by 
a method of the fourth, fifth or sixth aspect the 
invention . 
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[0046] Preferably, the method of the seventh, 
eighth or ninth aspect of the invention is a polymerase 
chain reaction method. More preferably the 
oligonucleotide molecules for use in the method are 
labelled. Even more preferably the hybridised nucleic 
acid molecules are detected by electrophoresis - 

[0047] In the above described methods it will be 
understood that where pairs of nucleic acid molecules are 
used one of the nucleic acid molecules may hybridise to a 
sequence that is not from the O antigen transferase, wzk 
or wzy gene or the flagellin gene. Further where both 
hybridise to these genes the O antigen molecules may 
hybridise to the same or a different one of these genes. 

[0048] In a tenth aspect the present invention 
provides a kit for identifying the H serotype of collf 
the kit comprising: 

at least one nucleic acid molecule derived from 
and specific for an E. coli flagellin gene. 

[0049] In an eleventh aspect the present invention 
provides a kit for identifying the H and O serotype of £. 
coli, the kit comprising: 

(a) at least one nucleic acid molecule derived 
from and specific for an E. coli flagellin gene; and 

(b) at least one nucleic acid molecule derived 
from and specific for a gene encoding a transferase or a 
gene encoding an enzyme for the transport or processing of 
a polysaccharide or oligosaccharide unit, the gene being 
involved in the synthesis of a particular E. coli O 
antigen . 

[0050] The nucleic acid molecules may be provided 
in the same or different vials. The kit may also provide 
in the same or separate vials a second set of specific 
nucleic acid molecules. 

[0051] Particularly preferred nucleic acid 
molecules for inclusion in the kits are those specified in 
Tables 3, 8, 8A, 9 and 9A as described above. 

DEFINITIONS 
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[0052] In this specification, we have used term 
"flagellin gene" in many cases where previously one would 
have used "fiiC", to allow for the uncertainty as to locus 
introduced by recent observations- However, uncertainty as 
to the locus does not alter the fact that most coli 
strains express a single H antigen and that a single 
flagellin gene sequence per strain is required to give the 
genetic basis for H antigen variation . Any use of the 
name flic in this specification where a different locus is 
later shown to be involved would not affect the validity 
of conclusions drawn regarding application of information 
based on the sequence, where the conclusions do not relate 
to the map position. Thus it is generally the nucleic 
acid molecule itself which is of importance rather than 
the name attributed to the gene. When it is known or 
suspected that the gene encoding the H antigen is not in 
the flic locus, we use the term flagellin rather than 
flic. 

[0053] The phrase, "a nucleic acid molecule 
derived from a gene" means that the nucleic acid molecule 
has a nucleotide sequence which is either identical or 
substantially similar to all or part of the identified 
gene. Thus a nucleic acid molecule derived from a gene 
can be a molecule which is isolated from the identified 
gene by physical separation from that gene, or a molecule 
which is artificially synthesised and has a nucleotide 
sequence which is either identical to or substantially 
similar to all or part of the identified gene. While some 
workers consider only the DNA strand with the same 
sequence as the mRNA transcribed from the gene, here 
either strand is intended. 

[0054] Transferase genes are regions of nucleic 
acid which have a nucleotide sequence which encodes gene 
products that transfer monomeric sugar units. 

[0055] Flippase or wzx genes are regions of 
nucleic acid which have a nucleotide sequence which 
encodes a gene product that flips oligosaccharide repeat 
units generally composed of three to six monomeric sugar 
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units to the external surface of the membrane. 

[0056] Polymerase or p/zy genes are regions of 
nucleic acid which have a nucleotide sequence which 
encodes gene products that polymerise repeating 
oligosaccharide units generally composed of 3-6 monomeric 
sugar units. 

[0057] The nucleotide sequences provided in this 
specification are described as anti-sense sequences. This 
term is used in the same manner as it is used in Glossary 
of Biochemistry and Molecular Biology Revised Edition, 
David M. Glick, 1997 Portland Press Ltd., London on page 
11 where the term is described as referring to one of the 
two strands of double-stranded DNA usually that which has 
the same sequence as the mRNA. We use it to describe this 
strand which has the same sequence as the mRNA. 



NOMENCIiATURE 



Synonyms for coli 0111 rfb 



Current names 


Our names 




Bastin et al. 19 


wbdH 


orf 1 






gmd 


orf2 






wbdl 


orf3 




orf3.4* 


manC 


orf 4 




rfbM^ 


manB 


orfS 




rfbK* 


wbdJ 


orf 6 




orf6.7* 


wbdK 


orf7 




orf 7. 7^ 


wzx 


orf 8 




orf 8. 9 and rfbX* 


wzy 


orf 9 






wbdL 


orf 10 






wbdM 


orfll 






* Nomenclature according to Bastin 


D. A. , 


et al. 1991 "Molecular 


cloning and expression 


in Escherichia 


coli 


K-12 of the rfb gene 


cluster determining the 


0 antigen of 


an 


coli Olll strain". Mol. 



Microbiol. 5:9 2223-2231, 
Other Synonyms 



wzy rfc 

wzx rfbX 

rmlA rfbA 

rmlB rfbB 

rmlC rfbC 

rmlD rfbD 

glf orf6* 

wbbi orf3#, orf8* of coli K-12 

wbbJ orf2#, orf9* of coli K-12 
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wbbK orfl#, orflO* of coli K-12 

wbbL orf5#, orf 11* of coli K-12 

# Nomenclature according to Yao, Z. And M. A. Valvano 1994. 

"Genetic analysis of the 0-specific lipopolysaccharide biosynthesis 
region (rfb) of Eschericia coli K-12 W3110: identification of genes 
the confer groups-specif icty to Shigella flexineri serotypes Y and 
4a". J, Bacteriol. 176: 4133-4143. 

* Nomenclature according to Stevenson et al . 1994. "Structure of 
the 0-antigen of E. coli K-12 and the sequence of its rfb gene 
cluster". J. Bacteriol 176: 4144-4156. 

• The O antigen genes of many species were given rfb names ( rf bA 
etc) and the O antigen gene cluster was often referred to as the 
rfb cluster. There are now new names for the rfb genes as shown 
in the table. Both terminologies have been used herein, depending 
on the source of the information. 

In the claims that follow and in the summary of the 
invention, except where the context requires otherwise due 
to express language or necessary implication, the word 
"comprising" is used in the sense of "including", i.e. the 
features specified may be associated with further features 
in various embodiments of the invention. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0058] Figure 1 shows Eco Rl restriction maps of 
cosmid clones pPR1054, pPR1055, pPR1056, pPR1058, pPR1287 
which are subclones of E. coli 0111 O antigen gene 
cluster. The thickened line is the region common to all 
clones- Broken lines show segments that are non- 
contiguous on the chromosome. The deduced restriction 
map for E, coli strain M92 is shown above. 

[0059] Figure 2 shows a restriction mapping 
analysis of E. coli 0111 O antigen gene cluster within 
the cosmid clone pPR1058. Restriction enzymes are: (B: 
BamRl; Eg: Bglll, E: EcoRl; H: Hindlll; K: Kp/il; P: Pstl; 
S: Sail and X: Xhol . Plasmids pPR1230, pPR1231, and 
pPR1288 are deletion derivatives of pPR1058. Plasmids 
pPR 1237, PPR1238, pPR1239 and pPR1240 are in pUC19. 
Plasmids pPRl243, pPR1244, pPR1245, pPR1246 and pPR1248 
are in pUC18, and pPR1292 is in pUC19. Plasmid pPR1270 
is in pT7T319U. Probes 1, 2 and 3 were isolated as 
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internal fragments of pPR1246, pPR1243 and pPRl237 
respectively- Dotted lines indicate that subclone DNA 
extends to the left of the map into attached vector. 

[0060] Figure 3 shows the structure of E. coll 
0111 O antigen gene cluster, 

[0061] Figure 4 shows the structure of E. coli 
0157 O antigen gene cluster. 

[0062] Figure 5 shows the nucleotide sequence (SEQ 
ID NO: 45) of the E. coli 0111 O antigen gene cluster. 
Note: (1) The first and last three bases of a gene are 
underlined and of italic respectively.; (2) The region 
which was previously sequenced by Bastin and Reeves 1995 
"Sequence and anlysis of the O antigen gene (rfb) cluster 
of Escherichia coli Olll" Gene 164: 17-23 is marked- 

[0063] Figure 6 shows the nucleotide sequence (SEQ 
ID NO: 56) of the E. coli 0157 O antigen gene cluster. 
Note: (1) The first and last three bases of a gene 
(region) are underlined and of italic respectively (2) The 
region previously sequenced by Bilge et al. 1996 "Role of 
the Escherichia coli 0157-H7 O side chain in adherence and 
analysis of an rfb locus". Inf. and Immun 64:4795-4801 is 
marked. 

[0064] Figures 7 to 9 show the nucleotide 
sequences (SEQ ID NOS:66 to 68 respectively) obtained for 
flagellin genes from E. coli type strains for HI to H3 
respectively. The primer positions listed in Table 3 are 
based on treating the first nucleotide of each of these 
sequences as No. 1. 

[0065] Figures 10 to 18 show the nucleotide 
sequences (SEQ ID NOS : 6 to 14 respectively) obtained for 
flagellin genes from E. coli type strains for H4 to H12 
respectively. The primer positions listed in Table 3 are 
based on treating the first nucleotide of each of these 
sequences as No. 1. 

[0066] Figures 19 and 20 show the nucleotide 
sequences (SEQ ID NOS: 15 to 16 respectively) obtained for 
flagellin genes from E. coli type strains for H14 and H15 
respectively. The primer positions listed in Table 3 are 
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based on treating the first nucleotide of each of these 
sequences as No- 1. 

[0067] Figures 22 ^ttdto 26 show the nucleotide 
sequences (SEQ ID N0S:17 to 21 respectively) obtained for 
flagellin genes from E. coli type strains for H17 mtdto 
H21 respectively. The primer positions listed in Table 3 
are based on treating the first nucleotide of each of 
these sequences as No. 1. 

[0068] Figures 27 to 39 show the nucleotide 
sequences (SEQ ID NOS:22 to 34) obtained for flagellin 
genes from E. coli type strains for H23 to H35 
respectively. The primer positions listed in Table 3 are 
based on treating the first nucleotide of each of these 
sequences as No. 1. 

[0069] Figures 40 to 49 show the nucleotide 
sequences (SEQ ID NOS:35 to 44) obtained for flagellin 
genes from E, coli type strains for H37 to H4d 
respectively. The primer positions listed in Table 3 are 
based on treating the first nucleotide of each of these 
sequences as No. 1. 

[0070] Figures 50 to 55 show the nucleotide 
sequences (SEQ ID NOS:46 to 51) obtained for flagellin 
genes from E. coli type strains for H47 to H52 
respectively. The primer positions listed in Table 3 are 
based on treating the first nucleotide of each of these 
sequences as No. 1, 

[0071] Figures 56 to 58 show the nucleotide 
sequences (SEQ ID NOS:52 to 54) obtained for flagellin 
genes from E. coli type strains for H54 to H56 
respectively. The primer positions listed in Table 3 are 
based on treating the first nucleotide of each of these 
sequences as No. 1. 

[0072] Figure 59 shows the nucleotide sequence 
(SEQ ID NO: 55) obtained for the flagellin gene from E. 
coli H7 strain M1179- The primer positions listed in 
Table 3 are based on treating the first nucleotide of each 
of these sequences as No. 1, 

[0073] Figures 60 to 68 show the nucleotide 
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sequences (SEQ ID NOS:57 to 65 respectively) obtained for 
flagellin genes from E. coli strains M1004, M1211, M1200, 
M1686, M1328, M917, M527, M973, and M918 respectively. The 
primer positions listed in Table 3 are based on treating 
the first nucleotide of each of these sequences as No. 1. 

[0074] Figure 59 shows the nucleotide sequence 
(SEQ ID N0:1) of the ^JiC gene and DNA flanking the fliC 
gene from the H25 type strain. 

[0075] Figure 70A shows the nucleotide sequence 
(SEQ ID NO: 2) obtained from the 5' end of the insert of 
plasmid pPR1989- The insert of plasmid pPR198 9 encodes 
the second flagellin gene of the H55 type strain. 

[0076] Figure 7 OB shows the nucleotide sequence 
(SEQ ID NO: 3) obtained from the 3' end of the insert of 
plasmid pPR1989. The insert of plasmid pPRl989 encodes 
the second flagellin gene of the H55 type strain. 

[0077] Figure 71 shows the nucleotide sequence 
(SEQ ID NO: 4) obtained from the 5' end of the insert of 
plasmid pPR1993. The insert of plasmid pPR1993 encodes 
the second flagellin gene of the H36 strain. 

[0078] Figure 72 shows the nucleotide sequence 
(SEQ ID NO: 5) obtained from the 3' end of the insert of 
plasmid pPR1993. The insert of plasmid pPRl993 encodes 
the second flagellin gene of the H36 type strain. 

[0079] Figure 73 A shows the sequence of 
polylinker and the SD sequence of plasmid pTrc99A. 

[0080] Figure 73B shows the sequence of the 
junction region between the SD sequence and the start of 
flagellin gene in the plasmids used for the expression of 
flagellin genes. 

EXAMPLES OF THE INVENTION 

[0081] In carrying out the methods of the 
invention with respect to the testing of particular sample 
types including samples from food, patients, animals and 
faeces the samples are prepared by routine techniques 
routinely used in the preparation of such samples for DNA 
based testing- The steps for testing the samples using 



-19- 



U.S. Serial No. 09/701,132 

PROVISIONAL ELECTION AND PRELIMINARY AMENDMENT 
AMENDMENTS TO THE SPECIFICATION 

particular nucleic acid molecules in assay formats such as 
Southern blots and PGR are performed under routinely 
determined conditions appropriate to the sample and the 
nucleic acid molecules . 

H aii'ki.gen 

Ma-berlals and Met:hods 

I. Bacterial strains and plasmid: 

[0082] There are 54 H types in E. coli [Ewing^. 
W.H.: Edwards and Swing's identification of the 
Siiterojbacteriaceae. , Elsevier Science Publishers, 
Amsterdam;. The Netherlands, 1986] : note H antigens from 1 
to 57 were listed and that 13, 22 and 57 are not valid. 
All the standard H type strains except H16 were obtained 
from the Institute of Medical and Veterinary Science, 
Adelaide, Australia. The primary stocks are hold at the 
Statens Serum Institut, Copenhagen, Denmark - 

[0083] The additional H7 strains used are listed 
in Table 1. 

[0084] We do not have the type strain for H16. It 
is known that the H3 type strain is biphasic and can also 
express the H16 flagellin gene [Ratiner, Y. A. (1985) "Two 
genetic arrangements determining flagellar antigen 
specificities in two diphasic coli strains. FEMS 
Microbiol Lett 19: 317-323]. We have sequenced and cloned 
the H16 flagellin gene from the H3 type strain (see 
below) . 

[0085] E. coli K-12 strain C600 hsm hsr fliC: :TnlO 
[Kuwajiwa, G. (1988) "Flagellin domain that affects H 
antigenicity of E. coli K-12" J. Bacteriol. 170; 485-488] 
(laboratory stock no. M2126) was obtained from Dr Benita 
Westerlund-Wikstrom of the Department of Biosciences, 
University of Helsinkin, Finland. E. coli K-12 strain 
EJ2282 (laboratory no. P5560) is a fliC deletion strain, 
and was obtained from Dr Masatoshi Enomoto of the 
Department of Biology, Okayama University, Japan 
[Tominaga, A. M. A.-H. Mahmound, T. Mokaihara and M. 
Enomoto (1994) "Molecular characterization of intact but 
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cryptic, flagellin genes in the genus Shigella.: Mol. 
Microbiol. 12: 277-285]. 

[0086] Plasmid pTrc99A was purchased from 
Pharmacia LKB (Melbourne, VIC, Australia) . 

2. Antisera 

[0087] Antisera against HI, H3, H8, H14, H15, H17, 
H23, H24, H25, H26, H29, H30, H31, H32, H33, H35, H36, 
H37, H38, H39, H43, H44, H46, H47, H48, H49, H52, H53, 
H54, H55, and H56 were obtained from the Institute of 
Medical and veterinary Science, Adelaide, Australia. 
Antisera against H2, H4, H5, H6, H7, H9, HIO, Hll, H12, 
H16, H18, H19, H20, H21, H27, H28, H34, H40, H41, H42, 
H45, and H51 were obtained from Denka Seiken Co., Ltd, 
Tokyo, Japan. 

[0088] Antisera to type H50 was not available from 
any known source. 

[0089] The antisera available were checked against 
the appropriate type strains to confirm the specificities 
of both flagellin H antigen and H antisera: 52 sera (all 
those except anti-H16 serum listed above) gave a positive 
reaction with the corresponding type strains for that 
serum. 

3. Agglutination test: 

[0090] Bacteria from 1 ml of an overnight culture 
grown in Luria broth (Difco Tryptone, lOg/1; Difico yeast 
extract, 5g/l; NaCl, 0.5 g/1; pH 7.2) at 30oC was 
centrifuged (4000 rpm/10 min) and the bacteria pellet 
resuspended in 100 ml of saline. The agglutination test 
was carried out by mixing equal volumes (5 ml) of both the 
cells and antiserum on a slide. The slide was rocked for 1 
minute and then observed for agglutination. For all 
agglutination tests, saline containing no antiserum was 
mixed with cells to be used as a negative control - 

[0091] For testing the H specificities of strain 
M2126 or strain P5560 carrying plasmid containing cloned 
flagellin genes, cells of M2126 or P5560 were used as an 
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additional negative control. 

[0092] All agglutination tests were first carried 
out using undiluted antisera (note that the antisera we 
used have been diluted before reaching our hands) , except 
for anti-Hll, anti-H34, anti-H52 and anti-H25 serum for 
which we used 1:10 dilutions to avoid background 
agglutination. In cases for which cross-reactions have 
been reported, we carried out agglutination tests using 
serial dilutions of sera (see section 10.1) 

4. Motility test: 

[0093] The motility of strain M2126 or strain 
P5560 carrying cloned flagellin genes was examined 
microscopically. 1 ml of overnight culture grown in Luria 
broth (Difco Tryptone, lOg/1; Difico yeast extract, 5g/l; 
NaCl, 0.5 g/1; pH 7.2) at 30oC was inoculated into 10 ml 
of Luria broth^ and the culture was shaken at 100 rpm at 
30oC to early log phase (OD 625 = 0.2). A loopful of 
culture was placed on a slide and examined under a 
microscope. Motility of individual cells was easily 
distinguished from Brownian movement and streaming, and 
presence or absence of motility recorded. 

5. Isolation of chromosomal DNA: 

[0094] Chromosomal DNA from all the 53 H type 
strains and the strains listed in Table 1 was isolated 
using the Promega Genomic isolation kit (Madison WI USA) . 
Each chromosomal DNA sample was checked by gel 
electrophoresis of the DNA and by PGR amplification of the 
mdh gene using oligonucleotides based on the E. coli K-12 
mdh gene [Boyd, E.F., Nelson, K., Wang, F.-S., Whittam, 
T.S. and Selander, R.K.: Molecular genetic basis of 
allelic polymorphism in malate dehydrogenase {mdh) in 
natural populations of Escherichia coli and Salmonella 
enterics. Proc. Natl. Acad. Sci. USA 91 (1994) 1280-1284]. 

6". PCR amplification of flagellin gene: 

[0095] Flagellin genes from different strains 
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were first PGR amplified using one of the following four 
pairs of oligonucleotides: 



#1285 


(SEQ 


ID 


NO: 


77) 


(5' 


-atggcacaagtcattaatac) and 


#1286 


(SEQ 


ID 


NO: 


78) 


(5- 


-ttaaccctgcagtagagaca) ; 


#1417 


(SEQ 


ID 


NO: 


79) 


(5' 


-ctgatcactcaaaataatatcaac) and 


#1418 


(SEQ 


ID 


NO: 


80) 


(5' 


-ctgcggtacctggttggc) ; 


#1431 


(SEQ 


ID 


NO: 


81) 


(5' 


-atggcacaagtcattaatacccaac) and 


#1432 


(SEQ 


ID 


NO: 


82) 


(5' 


-ctaaccctgcagcagagaca) : 


#1575 


(SEQ 


ID 


NO: 


83) 


(5- 


-gggtggaaacccaatacg) and 


#1576 


(SEQ 


ID 


NO: 


84) 


(5* 


-gcgcatcaggcaatttgg) 



PGR reactions were carried out under the following 
conditions: denaturing, 94°G/30'; annealing, temperature 
varies (refer to Table 2)/30*; extension, 72°G/1'; 30 
cycles- The PGR product was purified using the Promega 
Wizard PGR purification kit (Madison WI USA) before being 
sequenced. 

[0096] The H36 and H53 type strains gave two PGR 
bands using primer pairs #1431/#1432 (SEQ ID NOS:81/82) 
and #1417/#1418 (SEQ ID NQS: 79/80 respectively, and were 
not sequenced. 

7. Enzymes and buffers : 

[0097] Restriction endonucleases and DNA T4 ligase 
were purchased from Boehringer Mannheim (Castle Hill, NSW, 
Australia) . Restriction enzymes were used in the 
recommended commercial buffer, 

8, Sequencing of the flagellin genes: 

[0098] Each PGR product was first sequenced using 
the oligonucleotide primers used for the PGR 
amplification. Primers based on the obtained sequence were 
then used to sequence further, and this procedure was 
repeated until the entire PGR product was sequenced. 

[0099] The sequencing reactions were performed 
using the DyeDeoxy Terminator Cycle Sequencing method 
(Applied Biosystems, GA, USA) , and reaction products were 
analysed using fluorescent dye and an ABI377 automated 
sequencer (GA, USA) . 
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[00100] Sequence data were processed and analysed 
using Staden programs [Sacchi CT, Zanella R C, Caugant D 
A, Frasch C E, Hidalgo N T, Milagres L G, Pessoa L L, 
Ramos S R, Camargo M C C and Melles C E A "Emergence of a 
new clone of serogroup C Neissorza meningitidis in Sao 
Paulo, Brazil" J. Clin. Microbiol. 30 (1992) 1282-1286; 
Staden, R. : Automation of the computer handling of gel 
reading data produced by the shotgun method of DNA 
sequencing- Nucl. Acids Res. 10 (1982a) 4731-4751; 
Staden, R, : An interactive graphics program for comparing 
and aligning nucleic acid and amino acid sequences. Nucl. 
Acids Res. 10 (1982b) 2951-2961; 

Staden, R. : Computer methods to locate signals in nucleic 
acid sequences. Nucl. Acids Res. 12 (1984a) 505-519; 
Staden, R. : Graphic methods to determine the function of 
nucleic acid sequences. A summary of ANALYSEQ options. 
Nucl. Acids Res. 12 (1984b) 521-538; 

Staden, R. : The current status and portability of our 
sequence handling software. Nucl. Acids Res. 14 (1986) 
217-231] . 

[00101] We were able to PCR amplify flagellin genes 
from H type strains for H7, 23, 12, 51, 45, 49, 19, 9, 30, 
32, 26, 41, 15, 20, 28, 46, 31, 14, 18, 6, 34, 48, 43, 10, 
52, and also from H7 strains ml004, m527, ml686, ml211, 
ml328, m973, mll79, ml200, m917, and m918 using primers 
#1575 (SSQ ID NO: 83) and #1576 (SEQ ID NO: 84) which are 
based on sequences 51-34 bp upstream and 37-54 bp 
downstream of start and end of the E. coli K-12 flic gene 
respectively. Thus, the full sequence of the flagellin 
gene from these strains was obtained and the use of 
flanking sequence for primers makes it highly likely that 
they are at the fliC locus. 

[00102] For other strains, we were only able to 
amplify the flagellin gene using one or more of the other 
three pairs of primers, which are based on sequence within 
the flic gene, and thus only partial sequence was 
obtained. These amplicons may be of the fliC gene or one 
of the alternative flagellin genes. The flagellin gene 
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sequences from H type strains for H40, 8, 21, 47, 11, 27, 
35, 2, 3, 24, 37, 50, 4, 44, 38, 55, 29, 33, 5, and 56 
obtained are lacking 18 and 14 codons at 5* and 3' ends 
respectively. The flagellin gene sequence of H39 obtained 
using primers #1285/#1286 (SEQ ID NOS:77/78) lacks 18 and 
19 codons at 5' and 3' ends respectively. The flagellin 
gene sequence of H type strains of H17, 25 and 42 lack 23 
and 21 codons at 5' and 3' ends respectively. The 
flagellin gene sequence of the H type strain for H54 lacks 
23 and 12 codons at the 5' and 3' ends respectively. There 
is very little variation in the sequence at the two ends 
of flagellin genes and antigenic variation is due to 
variation in the central region of the gene. The absence 
of sequence for the ends of some of the flagellin genes is 
not important for the purpose of the present invention 
relating to the detection of antigenic variation by DNA 
sequence based means. 

[00103] The fliC genes from H type strains of HI, 
H7 and H12 have been sequenced previously [Schoenhals, G. 
and Whitfield, C. : Comparative analysis of flagellin 
sequences from Escherichia coll strains possessing 
serologically distinct flagellar filaments with a shared 
complex surface pattern. J. Bacterid. 175 (1993) 5395- 
5402] and we did not sequence the gene from the Hi strain. 

[00104] We have sequenced fllC genes from a set of 
H7 strains with different O antigens, including that of 
flic from the H7 type strain as one of the set; we have 
found four differences from the published H7 sequence 
(GenBank accession number L07388) which we believe are due 
to errors in the published sequence. 

[00105] We have also re-sequenced the fllC gene 
from the H12 type strain, and have found one difference 
from the published H12 sequence (GenBank accession number 
L07389) which we believe is due to an error in the 
published sequence . 

[00106] The flagellin genes from type strains H35 
and H54 were also amplified using primers #1431/#1432, 
which are based on sequence within the fllC gene. 
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Sequence data revealed that these two genes would be non- 
functional due to insertion sequence inserted in the 
middle of them. We have sequenced them to facilitate 
selection of primers for the functional flagellin genes. 

5- Cloning of flagellin genes 

[00107] DNA was digested for 2 hr at 37°C with 
appropriate restriction enzyme (s). The reaction product 
was then extracted once with phenol, and twice with ether. 
DNA was precipitated with 2 vols of ethanol and 
resuspended in water before the ligation reaction was 
carried out. Ligation was carried out 0/N at 4^C and the 
ligated DNA was electroporated into one of the call 
flic mutant strains - 

9.1. Cloning of flagellin genes from type strains for HI 
H2, H3r H5, H6, H7 , H9 , HIO, Hll, H12 , H14 , H15 , H18, H19, 
H20, H21, H24, H26, H27, H28, H29, H31, H34, H38, H39 r 
H41, H42, H43, H45, H46, H49, H51, H52, and H56: 

[00108] The full flagellin gene was PGR amplified 
using primers #1868 (SEQ ID NO: 69) and #1870 (SEQ ID 
NO: 71) (Table 3A) . Both these primers are based on the 
sequences of the H7 flagellin gene of the H7 type strain. 
#1868 (SEQ ID NO: 69) is the 5' primer: there is an Ncol 
site incorporated into the primer (Table 3B) and the 
flagellin gene starts at base 3 of the Ncol site. The 3' 
primer #1870 (SEQ ID N0:71) has a BamHl site incorporated 
downstream of the stop codon of the flagellin gene (Table 
3B) . PGR reactions were carried out under the following 
conditions: denaturing, 94oC/30'; annealing, temperature 
varies (refer to Table 3A)/30'; extension, 72oC/l'; 30 
cycles. The PGR product was purified using the Promega 
Wizard PGR purification kit (Madison WI USA) before being 
digested by restriction enzymes Ncol and BamHl and cloned 
into the NcoI/BamHl sites of plasmid pTrc99A. 

[00109] Plasmid pTrc99A has a strong trc promoter 
upstream of the polylinker. Downstream of the promoter, it 
contains the ribosome binding site (SD sequence, see Fig 
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73) which is located 8bp upstream of the ATG site within 
the Ncol site. The polylinker and the SD sequence of 
pTrc99A are shown in Fig 73. 

[00110] The plasmids generated were given pPR 
numbers, and they are listed in Table 3A. In these 
plasmids, the expression module consists of the trc 
promoter, the SD sequence, and the full flagellin gene. 
The sequence of the junction region between the SD 
sequence and the start of flagellin gene is shown in Fig 
73. 

[00111] For flagellin genes from type strains for 
H6, H7, H9, HIO, H12, H14, H18, H19, H20, H26, H28, H31, 
H41, H43, H45, H46, H49, H51, and H52, we have the full 
sequence for each gene and the primer sequences (#1868 
(SEQ ID NO: 69) and #1870 (SEQ ID NO: 71) ) are conserved 
among them. The cloned genes therefore have the same 
sequence as those from the type strains. 

[00112] For flagellin genes from type strains for 
Hi, H15 and H34, we also have the full sequence. The 
previously published sequence of the flagellin gene from 
the HI type strain was extracted from GenBank (accession 
number L07387) and used. Primer #1868 (SEQ ID NO: 69) is 
conserved in all three. But, primer #1870 (SEQ ID NO: 71) 
has the third base of the fifth last codon in the HI 
sequence changed from A to G, and the third base of the 
second last codon changed from C to T in the H15 and H34 
sequences: these changes did not change the amino acid 
coded, so the cloned genes encode the same gene products 
as those of the type strains. 

[00113] For flagellin genes from type strains for 
H2, H3, H5, Hll, H21, H24, H27, H29, H38, H39, H42, and 
H56, we do not have the full sequences. In the plasmids 
carrying genes from these type strains, the expression 
module consists of the trc promoter, the SD sequence, and 
the full flagellin gene with the first and the last 21 
base pairs being determined by the primer sequences which 
are based on the H7 flagellin gene of the H7 type strain. 
The sequence of the junction region between the SD 
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sequence and the start of flagellin gene is shown in Fig 
73. 

9.2. Cloning of the flagellin gene from type strain of 
H23: 

[00114] The full flagellin gene was PGR amplified 
using primers #1868 (SEQ ID NO: 69) and #1869 (SEQ ID 
NO:70) (Table 3A) . #1868 (SEQ ID NO: 69) is the 5' primer: 
there is an Ncol site incorporated into the primer (Table 
SB) and the flagellin gene starts at base 3 of the Afcol 
site. The 3' primer #1869 (SEQ ID NO:70) has a Sail site 
incorporated downstream of the stop codon of the flagellin 
gene (Table 3B) . PGR reactions were carried out under the 
following conditions: denaturing, 94oG/30'; annealing, 
55oC/30'; extension, 72oC/l'; 30 cycles. The PGR product 
was purified using the Promega Wizard PGR purification kit 
(Madison wl USA) before being digested by restriction 
enzymes Ncol and Sail and cloned into the Ncol/Sall sites 
of plasmid pTrc99A to give plasmid pPR1942. 

[00115] Plasmid pTrc99A has a strong trc promoter 
upstream of the polylinker. Downstream of the promoter, it 
contains the ribosome binding site (SD sequence, see Fig 
73) which is located 8bp upstream of the ATG site within 
the Ncol site. The polylinker and the SD sequence of 
pTrc9 9A are shown in Fig 73. 

[00116] The expression module of pPRl942 consists 
of the trc promoter, the SD sequence, and the full 
flagellin gene. The sequence of the junction region 
between the SD sequence and the start of flagellin gene is 
shown in Fig 73. 

9.3. Cloning of flagellin genes from type strains of H30^ 
H32 and H33: 

[00117] The full flagellin gene was PGR amplified 
using primers #1868 (SEQ ID NO: 69) and #1871 (SEQ ID 
NO:72) (Table 3A) . #1868 (SEQ ID NO: 69) is the 5' primer: 
there is an Ncol site incorporated into the primer (Table 
3B) and the flagellin gene starts at base 3 of the Ncol 



-28- 



U.S. Serial No. 09/701,132 

PROVISXONAIi ELECTION AND PRELIMINARY A^dENDMENT 
AMENDMENTS TO THE SPECIFICATION 

site. The 3' primer #1871 (SEQ ID NO:72) has a Pstl site 
incorporated downstream of the stop codon of the flagellin 
gene (Table 3B) . PGR reactions were carried out under the 
following conditions: denaturing, 94oC/30'; annealing, 
temperature varies (refer to Table 3A)/30'; extension, 
72oC/l'; 30 cycles. The PGR product was purified using the 
Promega Wizard PGR purification kit (Madison WI USA) 
before being digested by restriction enzymes Ncol and Pstl 
and cloned into the Ncol/ Pstl sites of plasmid pTrc99A. 

[00118] Plasmid pTrc99A has a strong trc promoter 
upstream of the polylinker. Downstream of the promoter, it 
contains the ribosome binding site (SO sequence, see Fig 
73) which is located 8bp upstream of the ATG site within 
the Ncol site. The polylinker and the SD sequence of 
pTrc99A are shown in Fig 73. 

[00119] For flagellin genes from type strains for 
H30 and H32, we have the full sequence. Primer #1868 (SEQ 
ID NO: 69) sequence is conserved in both of them. But, 
primer #1871 (SEQ ID NQ:72) has the third base of the 
fourth last codon in both sequences changed from G to A to 
remove a Pstl site (see Table 3B) : this change did not 
change the amino acid coded. The expression module 
consists of the trc promoter, the SD sequence, and the 
full flagellin gene coding for a gene product which is 
same as that of the type strain. The sequence of the 
junction region between the SD sequence and the start of 
flagellin gene is shown in Fig 73. 

[00120] We do not have the full sequence for the 
flagellin gene from the H33 type strain. In the plasmid 
containing the H33 type strain gene, the expression module 
consists of the trc promoter, the SD sequence, and the 
full flagellin gene with the first and the last 21 base 
pairs been determined by the primer sequences which were 
used for the cloning of H30 and H32 . The sequence of the 
junction region between the SD and the start of flagellin 
gene is shown in Fig 73. 

9.4. Flagellin genos from type strains for H4 and H17: 
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[00121] For the flagellin genes of H4 and H17 type 
strains the full sequence was not obtained, and the 
sequenced parts were PGR amplified and cloned into plasmid 
pPR1951 to give in each case a gene in which the first 26 
and the last 31 codons are based on the sequence of the H7 
flagellin gene of the H7 type strain. 

9.4.1 Construction of expression plasmid vector pPR1951: 

[00122] The first 2 6 codons of the H7 flagellin 
gene was first PGR amplified using primers #18 68 (SEQ ID 
NO: 69) and #1872 (SEQ ID NO: 73) (Table 3B) . #1868 (SEQ ID 
NO: 69) is the 5' primer: there is an Ncol site 
incorporated into the primer (Table 3B) and the flagellin 
gene starts at base 3 of the Ncol site. Primer #1872 (SEQ 
ID NO: 73) was made to have the last two codons (codons 2 5 
and 26) changed from GTG TCG (Leucine and Serine) to GGA 
TOO (Glycine and Serine) to generate a BamRl site. This 
PGR fragment was digested with Ncol and BamHl before being 
cloned into the Ncol/ BamUl sites of pTrc99A to make 
plasmid pPR1949. 

[00123] The last 31 codons (including the stop 
codon) of the H7 flagellin gene was PGR amplified using 
primers #1884 (SEQ ID NO:75) and #1871 (SEQ ID NO:72) 
(Table 3A) . The 5' primer, #1884 (SEQ ID NQ:75) , has the 
first two of the 31 codons changed from TCG AAA (Serine 
and Lysine) to TGT AGA (Serine and Arginine) to generate a 
Xbal site (Table 3B) . The 3' primer #1871 (SEQ ID NO: 72) 
has a Pstl site incorporated downstream of the stop codon 
(Table 3B) . This PGR fragment was digested with Xfoal and 
Pstl, and then cloned into the Xbal/ Pstl sites of pPR1949 
to make plasmid pPR1951. 

9.4.2 Cloning of flagellin genes from the H4 and H17 type 
strains : 

[00124] The central regions of flagellin genes from 
type strains H4 and H17 were PGR amplified using primers 
#1878 (SEQ ID NO:74) and #1885 (SEQ ID NO: 76) (Table 3B) , 
which have a BamHl and a Xbal incorporated at their ends 
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respectively. PGR reactions were carried out under the 
following conditions: denaturing, 94oC/30'; annealing, 
65oC/30'; extension, 72oC/l'; 30 cycles. The PGR product 
was purified using the Promega Wizard PGR purification kit 
(Madison WI USA) before being digested by restriction 
enzymes BamRl and Xbal and cloned into the Xbal/BamRl 
sites of plasmid pPR1951 to make plasmids pPR1955 (H4) and 
PPR1957 (H17) . 

[00125] The expression module of plasmids pPR1955 
and pPR1957 consists of the trc promoter, the SD sequence, 
the first 24 codons of the H7 flagellin gene (of the H7 
type strain), 2 codons encoding Glycine and Serine, 292 or 
293 codons of the central region based on the flagellin 
gene obtained from the H4 or H17 type strain respectively, 
2 codons encoding Serine and Arginine, and then the last 
29 codons of the H7 flagellin gene (of the H7 type 
strain) . 

10. Expression of flagBllin gene plasmids in coli 
strains lacking the fliC gene^ and identification of the H 
antigens encoded by these plasmids : 

[00126] Plasmids carrying flagellin genes as 
described in section 9 (see Table 3A for a list) were 
electroporated into strains M2126 or P5560, Strains M2126 
and P5560 do not have functional fliC genes, and are not 
motile when examined under a microscope. Transf ormants 
carrying any of the plasmids listed in Table 3A are motile 
when examined under a microscope. Thus, the flagellin 
genes in all of the plasmids are expressed. 

[00127] The antigenic specificity of the flagellin 
of each transformant was then determined by slide 
agglutination . 

10.1 Flagellin genes from type strains for H2 ^ H5 ^ H6 ^ 
H7, H9r Hll, H14, H15, H18 ^ H19, H20, H21, H23, H24, H26, 
H27, H28, H29, H30^ H31, H32 , H33, H34, H39, H41, H42, 
H43, H45, H46, H49, H51, H52, and H56: 

[00128] As shown in Table 3A, strains with plasmids 
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carrying these flagellin genes expressed the same H 
antigen as their respective flagellin parent strain. 

[00129] For flagellin specificities H2, H5, H6, H7, 
H9, H14, H15, H18, H19, H20, H23, H24, H26, H27, H28, H29, 
H31, H33, H39, H51, H52, and H56, there was no cross 
reaction reported between these flagellins and flagellin 
antisera for other H antigens [Ewing^ W. H. : Edwards and 
Ewing' s identification of the fJnterojbacteriaceae . , 
Elsevier Science Publishers, Amsterdam, The Netherlands, 
1986], and we conclude that we have in each case sequenced 
the gene encoding the flagellin of the expected 
specificity from the respective type strain. 

[00130] It has been observed that cross reactions 
exist between some type strains and certain antisera at 
different levels of dilution (of the antisera) , being Hll 
with anti-H21 and anti-H40, H21 with anti-Hll, H30 with 
anti-H32, H32 with anti-H30, H34 with anti-H24 and anti- 
H31, H41 with anti"H37 and anti-H39, H42 with anti-H6, H43 
with anti-H37, H45 with anti-H20, H46 with anti-H17, and 
H49 with anti-H39 [Ewing, W. H. : Edwards and Ewing' s 
identification of the Enterobactoriaceae , , Elsevier 
Science Publishers, Amsterdam, The Netherlands, 1986] . We 
have tested strain M2126 or strain P5560 carrying plasmids 
containing flagellin genes obtained from each of these 
type strains (Hll, H21, H30, H32, H34, H41, H42, H43, H45, 
H46, and H49) with the appropriate cross-reacting 
antisera . 

[00131] For strain M2126 or strain P5560 carrying 
plasmids containing flagellin genes obtained from type 
strains Hll, H34, H41, H42, H43, H45, H46, and H49, no 
cross reaction was found- We conclude that we have in each 
case sequenced the gene coding the flagellin of the 
expected specificity from the respective type strain. 

[00132] Cross reaction was observed for strain 
P5560 carrying plasmid pPR1948 (containing the flagellin 
gene obtained from the H30 type strain) with anti-H32 
serum, strain P5560 carrying pPR1940 (containing the 
flagellin gene obtained from the H32 type strain) with 
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anti-H30 serum, and strain M2126 carrying plasmid pPR1995 
(containing the flagellin gene obtained from the H21 type 
strain) with anti-Hll serum- 

[00133] We note that the reported cross reactions 
between the H30 type strain and anti-H32, the H32 type 
strain and anti-H30, and the H21 type strain and anti-Hll 
happened at a higher level of dilution (of antisera) than 
for all other type strains with the antisera mentioned 
above [Ewing, W. H. : Edwards and Ewing'^s identification of 
the Enterojbacteriaceae. , Elsevier Science Publishers, 
Amsterdam, The Netherlands, 1986] . We conclude that except 
for these three cases, the antiserum used were supplied at 
a dilution which did not exhibit cross reactions. For the 
three strains carrying flagellin genes cloned form type 
strains for H30, H32, and H21, it was necessary to further 
dilute the antiserum. 

[00134] Strain P5560 carrying plasrnid pPR1948 
(containing the flagellin gene obtained from the H30 type 
strain) agglutinates with anti-H30 serum when the 
antiserum is diluted to 1:60, but agglutinates with anti- 
H32 serum only at a dilution of 1:10 and not at a 1:20 
dilution (note that the antisera we used have been diluted 
before reaching our hands) . In contrast, strain P5560 
carrying plasmid pPR1940 (containing flagellin gene 
obtained from the H32 type strain) agglutinates with anti- 
H32 serum when the antiserum is diluted at 1:100, but 
agglutinates with anti-H30 serum only at a 1:10 dilution 
and not at a 1:10 dilution. Thus, we conclude that the 
flagellin genes we sequenced from type strains for H30 and 
H32 encode flagellins of H30 and H32 specificities 
respectively. 

[00135] Strain M2126 carrying plasmid pPR1995 
(containing the flagellin gene obtained from the H21 type 
strain) agglutinates with anti-H21 serum when the 
antiserum is diluted to 1:40, but agglutinates only with 
undiluted anti-Hll serum and not at a 1:10 dilution (note 
that the antisera we used have been diluted before 
reaching our hands) - In contrast, strain M2126 carrying 
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plasmid pPR1981 (containing flagellin gene obtained from 
the Hll type strain) did not agglutinate with anti-H21 
serum. Thus, we conclude that the flagellin genes we 
sequenced from type strains for H21 encodes flagellin of 
H21 specificity- 

10.2 Flagellin genes from type strains of HI and H12 : 

[00136] These two genes are very similar in 
sequence, with 8a .a difference between the gene products. 
It has been known that some cross-reaction exists between 
anti-H12 serum and the Hi type strain and between anti-Hl 
serum and the H12 type strain [Ewing, W. H. : Edwards and 
Ewing' s identification of the Enterobacterlaceae . , 
Elsevier Science Publishers, Amsterdam, The Netherlands, 
1986] . Strain M2126 carrying pPR1920 (carrying a flagellin 
gene from the HI type strain. Table 3A) agglutinates with 
anti-Hl serum when the antiserum is diluted to 1:100, but 
agglutinates only with undiluted anti-H12 serum and not at 
a 1:10 dilution (please note that the antisera we used 
have been diluted before reaching our hands). In contrast, 
strain M2126 carrying plasmid pPR1990 (carrying a 
flagellin gene from the H12 type strain. Table 3A) 
agglutinates with anti-H12 serum when the antiserum is 
diluted at 1:100, but agglutinates only with undiluted 
anti-Hl serum and not at a 1:10 dilution. Thus, we 
conclude that the flagellin genes we sequenced from type 
strains for Hi and H12 encode flagellins of HI and H12 
specificities respectively. 

10.3. Flagellin gene coding for HI 6: 

[00137] Strain P5560 carrying plasmid pPR1969 
agglutinated with anti-H16 serum. pPR1969 carries a 
flagellin gene amplified from the H3 type strain. It has 
been shown that this H3 type strain is a biphasic strain 
which can express H3 and H16 specificities [Ratiner, Y. A. 
(1985) "Two genetic arrangements determining flagellar 
antigen specificities in two diphasic coil strains. 
FEMS Microbiol Lett 19: 317-323]. Thus, the H3 type strain 
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has two flagellin genes coding for H3 and H16 
specificities. We conclude that we have cloned and 
sequenced the H16 flagellin gene from this H3 type strain. 

10.4 Flagellin gene coding for H4 : 

[00138] The flagellin genes obtained from type 
strains for H4 and H17 are nearly identical, with 4 a. a. 
difference in the gene products. Plasmid pPR1955 carries a 
flagellin gene from the H4 type strain, and plasmid 
pPR1957 carries a flagellin gene from the H17 type strain. 
Strain P5560 carrying plasmid pPR1955 or plasmid pPR1957 
agglutinated with anti-H4 serum, but not with anti-H17 
serum. It has been shown that the type strain for H17 is a 
biphasic strain which can express H17 and H4 [Ratiner, Y. 
A. (1985) "Two genetic arrangements determining flagellar 
antigen specificities in two diphasic E. coli strains. 
FEM3 Microbiol Lett 19: 317-323] . The flagellin gene 
obtained from type strain for H44 is also highly similar 
to that obtained from the H4 type strain, with 2 a, a. 
difference in the gene products. It has been shown that 
the H44 type strain has two complete flagellin genes, 
being H4 and H44 [Ratiner, Y. A. (1998) "New flagellin 
specifying genes in some coli strains" J- Bacteriol 
180: 979-984]. Thus, we conclude that all the three 
flagellin genes (obtained from type strains for H4, H17 
and H4 4, and sequenced) encode the H4 flagellin, and that 
the flagellin genes for H17 and H44 specificities have not 
yet been cloned. 

10.5 Flagellin gene coding for HIO: 

[00139] The flagellin genes obtained from type 
strains for HIO and H50 are nearly identical, with 3 a.a. 
difference in the gene products. Strain P5560 carrying 
plasmid pPR1923 (which carries a flagellin gene from the 
HIO type strain) agglutinated with anti-HlO serum. We 
conclude that the sequence obtained from the HIO type 
strain encodes the HIO flagellin. It is not clear if the 
sequence obtained from the H50 type strain encodes HIO or 
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H50 (see below section for H50) . 

10.6 Flagellin gene coding for H38: 

[00140] The flagellin genes obtained from type 
strains for H38 and H55 are nearly identical, with only 1 
a. a. difference in the gene products. Strain M2126 
carrying plasmid pPR1984 (carrying the flagellin gene from 
the type strain H38) agglutinated with anti-H38 serum, but 
not with anti-H55 serum. It also has been shown that the 
type strain for H55 has two complete flagellin genes 
coding for H55 and H38 specificities [Ratiner, Y, A. 
(1998) "New flagellin specifying genes in some E. coll 
strains" J. Bacterid 180: 979-984]. Thus, we conclude 
that both cloned genes encode the H38 flagellin. 

10.7 Summary: 

[00141] Flagellin genes coding for 39 H antigens 
have been identified, being those for specificities HI, 
H2, H4, H5, H6, H7, H9, HIO, Hll, H12, H14, HIS, H16, HIS, 
H19, H20, H21, H23, H24, H26, H27, H28, H29, H30, H31, 
H32, H33, H34, H38, H39, H41, H42, H43, H45, H46, H49, 
H51, H52, and H56. 

II. Comparison and alignment of the flagellin genes: 

[00142] Programs Pileup [Devereux, J., Haeberli, P. 
and Smithies, O. : A comprehensive set of sequence analysis 
programs for the VAX. Nucl . Acids Res. 12 (1984) 387-395] 
and Multicomp [Reeves, P.R., Farnell, L. and Lan, R. : 
MULTICOMP: a program for preparing sequence data for 
phylogenetic analysis. CABIOS 10 (1994) 281-284] were 
used. 

[00143] The previously published sequence of HI 
(GenBank accession number L07387) was extracted from 
GenBanlc and used. Because we did not sequence H36 and H53 
flagellin genes and we did not have the H16 type strain, 
we only compared 51 flagellin genes of H type strains and 
the flic genes from the additional 10 H7 strains. 

[00144] Among the H7 fliC genes, the percentage of 



-36- 



U.S. Serial No. 09/701,132 

PROVISIONAL ELKCTION AND PRELIMINARY AMENDMENT 
AMENDMENTS TO THE SPECIFICATION 

DNA difference ranged from 0 . 0 to 2.39%. The flagellin 
genes from type strains for H40 and H8 are identical. Some 
others are nearly identical: H21 and H47 (1.5% 
difference), H12 and HI (2.6% difference), HIO and H50 
(0.3% difference), H38 and H55 (0.1% difference), H4, H44 
and H17 are very similar, the pairwise difference ranging 
from 0.33% to 0.87%. 

[00145] For the flagellin genes obtained from type 
strains for H4, H17 and H44, we have shown that all the 
three genes encode flagellin with the H4 specificity (see 
above) . For the flagellin genes obtained from type 
strains f-rtgror H21 and H47, and H38 and H55, we have 
confirmed the specificities for one for each pair and have 
good reason to conclude that both genes of each pair 
encode the same H specificity (see above section) , being 
that for H21 and H38 specificities respectively. 

[00146] For the flagellin genes obtaitied from type 
strains for HlO and H50, we have confirmed that the one 
from the HIO type strain encodes HIO specificity- As 
these two genes are highly similar, we have presumed that 
they both encode HIO specificity. 

[00147] In the cases where the flagellin gene from 
two type strains is near identical, we conclude that both 
genes code for flagellin of the same H specificity and 
that one or other strain has an additional locus which 
carries the functional gene, although the flagellin genes 
sequenced do not appear to be mutated. 

[00148] We have shown by cloning and expression 
that the flagellin genes obtained from the Hi and H12 type 
strains encode HI and H12 specificities respectively (see 
above section) . The neucleotide difference between these 
two genes is higher at 2.6% (see above), but still within 
the normal range for variation within a gene in E. coli^ 
The two antigens cross react, and this cross reaction must 
be due to the high level similarity of the flagellins 
encoded by these two genes. 

[00149] As discussed above, genes encoding some H 
antigens have been shown to be located at loci other than 
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flic. H3, H36, H47, H53 have been shown to be at a locus 
called flkA, H44 and H55 at fllA, and H54 at flmA [Ratiner 
y A (1998) "New f lagellin-specif ying genes in some 
Escherichia coli strains" J. Bacteriol. 180 979-984]. 
However, these strains may carry a fliC in addition to 
flkAr fllA or flmA [Ratiner Y A (1998) "New flagellin- 
specifying genes in some Escherichia coli strains" J. 
Bacteriol. 180 979-984]. 

[00150] The flagellin gene encoding H48 was 
previously sequenced from E, coli strain K-12 [Kuwajima G, 
Asaka J, Fujiwara T, Node K and Kondo E "Nucleotide 
sequence of the hag gene encoding flagellin of Escherichia 
coli" J Bacteriol. 168 (1986) 1479-1483]. We have 
sequenced the fliC gene from the H4 8 type strain, and 
found that it is identical to that from K-12. 

[00151] The H54 gene is known to be at flmA 
[Ratiner Y A (1998) "New f lagellin=3pecif ying genes in some 
Escherichia coli strains" J. Bacteriol. 180 979-984] 
and the finding of a non-functional presumptive fliC locus 
in the H54 strain shows that it is present but not 
expressed. However, we have not amplified and sequenced 
the functional flmA gene of this strain. 

[00152] Using the 43 unique sequences (being the 39 
identified genes with confirmed specificities and the^ 
flagellin genes obtained from the HB (or H40) , H25, H37, 
and H48 type strains) and the sequences from the two non- 
functional flagellin genes (from H type strains H35 and 
H54) (see Table 3) we have been able to determine antigen 
specific primers for each of the H antigen specificities 
and thereby show that it is practicable to detect E.coli 
strains carrying specific H antigens without false 
positives from strains of other H types. There is no 
reason to expect that the addition of 11 sequences to the 
43 unique sequences obtained will affect the general 
conclusion, as unlike previous reports, our study covers 
flagellin sequences for a substantial majority of known E. 
coli H antigen specificities. 

[00153] Our study of 11 H7 genes from strains of 
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eight different O antigens shows limited variation which 
was such that the variation within genes for H antigens 
does not affect the ability to select antigen specific 
primers. 0:H combinations in general define a strain and 
as some of the strains thus defined were quite distant 
from each other in a study by Whittam [Whittam T S, wolfe 
M Wachsmuth I K, Orskov I and Wilson R A "Clonal 
relationships among Escherichia coli strains that cause 
hemorrhagic colitis and infantile diarrhea" Infect. Immun. 
61 (1993) 1619-1629] the variation we observe is thought 
to represent that generally present in H7 genes. We also 
obtained more than one sequences for flagellin genes for H 
specificities H4 , HIO, and H38, and again the level of 
variation within a given specif ities is very low. 
However, there is a low possibility that primers chosen 
without knowledge of the variation within genes of each H 
specificity could fail to give positive results with some 
isolates due to chance choice of primers which cover a 
base or bases which contribute to this low level 
variation. The variation within the H7 genes is in the 
normal range for variation within a gene in E. coli and if 
this possibility did occur it would be easy to use an 
alternate primer pair. For example, if a first primer in 
a primer pair is unable to hybridise to a target region 
because of low level variation in that region, a positive 
result may be achieved by using a second primer in that 
pair together with a third primer, whether or not the 
third primer is specific for the flagellin gene. Where 
the third primer is not specific for the flagellin gene, 
the specificity of the primer pair derives from the 
specificity of the second primer. The observation that 
the overall level of variation within gene for a given H 
specificity is very low making it extremely unlikely that 
the regions covered by the two primers specific for H 
specificity would both have undergone change in the same 
strain. 

[00154] There are 54 known H antigens for E. coli 
and of these there are 11 H antigen specificities for 
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which we do not as yet have sequence. It will be easy to 
determine these sequences and determine primer pairs 
specific for these H antigens by comparing these sequences 
with the 45 obtained sequences (see Table 3), and also 
modify the primers selected for any H antigen for which 
we already know the sequence in the unlikely event that 
there is a possibility of false positives with the primers 
selected- 

[00155] The sequences for the remaining H antigens 
can be obtained in one of the following ways: 

[00156] 1. where we have two bands by PGR (H36 
and H53 type strains) , we purify each and sequence^ and 
also clone each into a strain mutated in its fliC gene and 
determine the H antigen expressed by use of specific sera. 
In this way a specific sequence can be related to an H 
antigen specificity. The other band which represents an H 
antigen gene for a different specificity is expected to 
include a mutant gene or a gene similar to one of those 
for a known H specificity, but if not may represent a new 
specificity for which primer pairs could be selected. It 
may be difficult to obtain expression of flagellin genes 
when cloned from E. coli due to cloning together with 
regulatory sequences which prevent expression. This is 
easily avoided by cloning the major segment of the gene 
into a functioning fllC gene to replace the equivalent 
segment of that gene, using standard site directed 
mutagenesis to give suitable restriction sites within the 
cloned gene and incorporating those restriction sites into 
primers used to amplify the major segment of the gene to 
be studied to facilitate the cloning. We have cloned and 
sequenced the PGR bands from the H3 6 and the H55 type 
strains using this method (see section 16) . 

[00157] 2- Where two or more strains have the 
same flagellin gene sequence, the genes are cloned as 
above and the H antigen specificity represented by this 
sequence is determined. This identifies the strain in 
which the expected gene is expressed and also those 
strains for which we have sequenced a gene which is not 
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being expressed. We then clone the gene for the antigen 
expressed in these strains by making a bank of plasmid 
clones using chromosomal DNA and select for a clone which 
is expressing an H antigen different from the one 
represented by the known sequence. This can be done by 
taking advantage of the fact that the H antigen is on 
flagellin, the protein of the bacterial flagellum used for 
movement of the bacteria. In the presence of antibodies 
specific to that flagellum the bacteria cannot swim. For 
selection the clones are placed in a situation in which 
motile cells can swim away from the others and be 
collected. There are many versions of these techniques 
and any could be used. One version is to place the 
bacteria on a nutrient agar plate with reduced agar 
content such that bacteria can swim away from the site of 
inoculation. This is easily seen as growth on the plate 
and a sample of the bacteria which are motile can be 
recovered and cultivated. In this way bacteria carrying 
cloned H antigen genes can be selected. If the medium in 
the plate has antibody added to it only bacteria which 
express an H antigen different to that recognised by the 
antiserum will be able to swim. Specifically if the 
antiserum used is specific for the H antigen expressed by 
the gene for which we have sequence, only clones which 
express a different H antigen, such as those expressing 
the H antigen expressed by the H type strains used to make 
the plasmid, will be selected. Once the clone is 
obtained, the H antigen gene can be sequenced . 

[00158] Our work has shown that there are at least 
7 cases where the H antigen type strains carry two H 
antigen genes which appear to be complete and have the 
potential to function. However, while E, coli does not 
(in general) have a capacity to express more than one 
flagellin gene, it is striking that there are several loci 
for flagellin genes [Ratiner Y A (1998) "New flagellin- 
specifying genes in some Escherichia coli strains" J. 
Bacterid. 180 979-984]. Several of the pairs of H type 
strains with identical or near identical sequence do not 
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include any of the H antigen types shown by Ratiner 
[Ratiner Y A (1998) "New f lagellin-specif ying genes in some 
Escherichia coli strains" J, Bacteriol. 180 979-984] to 
map other than at fliC although these predominate- This 
suggests that there are additional cases where the 
expressed gene is not the only flagellin gene present. 
However the fact that many of the cases where we obtained 
flagellin genes of identical or near identical sequence 
and/or two flagellin genes from one strain involve type 
strains found by Ratiner [Ratiner Y A (1998) "New 
f lagellin-specif ying genes in some Escherichia coli 
strains" J. Bacterid. 180 979-984] to map away from fliC 
are among those near identical to others, indicates that 
the phenomenon is of limited extent. Nonetheless it 
remains possible even where only one gene has been 
obtained by PCR^, that it is one of a pair of flagellin 
genes, the other not being amplified by the primers used, 
and further that it is the one not amplified which is 
expressing the H antigen of the strain. It will therefore 
be necessary to clone as described above each of the 
flagellin genes we have sequenced and confirm that it 
expresses the expected antigen to ensure that the 
invention give results corresponding to those of the 
traditional serotyping scheme. In the event that it does 
not, the gene for the type antigen can be cloned and 
sequenced by the means described above. 

[00159] The 11 H7 fliC sequences fell into three 
groups, one comprising the genes from the 0157 :H7 and 
055 :H7 strains, which were identical, as expected given 
the proposed relationship between the clones. It has been 
shown that E. coli Ol57:H7 and 055 :H7 clones are closely 
related [Whittam T S, wolfe M L, Wachsmuth I K, Orskov I 
and Wilson R A "Clonal relationships among Escherichia 
coli strains that cause hemorrhagic colitis and infantile 
diarrhea" Infect. Immun. 61 (1993) 1619-1629] thus it was 
expected that the H7 fliC genes from 0157 and 055 would be 
identical. Among the H7 fliC sequences, we can identify 
primers specific to the H7 flic gene for each of the three 
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H7 groups. Two of these primers in combination with an H7 
specific primer gave two primer pairs specific for the H7 
gene of from the 0157:H7 and 055 :H7 clones. 

13. Specific oligonucleotide primers for each of the 43 
flagellin genes 

[00160] Two oligonucleotide primers were chosen 
based on each of the 43 sequences. None of them had more 
than 85% identity with any other of 61 flagellin gene 
sequences. Thus, these primers are specific for each H 
type. These primers are listed in Table 3. 

[00161] The flagellin gene of the H54 type strain 
is a mutated gene. It has an insertion sequence (IS1222) 
inserted into a normal flagellin gene of H21- Thus, 
primers for H21 would amplify a fragment of different size 
in H54. We also provide 2 primers based on the insertion 
sequence (see H54 row in Table 3) , and the use of one of 
them in combination with one of the H21 primers will 
generate a PGR band only in H54, which will also 
differentiate those strain carrying the mutated H21 gene 
from those expressing the H21 flagellin gene. 

[00162] The flic gene of H35 type strain is also a 
mutated gene. It has an insertion sequence (ISl) inserted 
into a normal flagellin gene of Hll. Thus, primers for 
Hll would amplify a fragment of different size in H35. We 
also provide 2 primers based on the insertion sequence 
(see H35 row in Table 3) , and the use of one of them in 
combination with one of the Hll primers will generate a 
PGR band only in H35, which will also differentiate those 
strain carrying the mutated Hll gene from those expressing 
the Hll flagellin gene. 

14. Testing of the H7 specific oligonucleotide primers 

[00163] Primer pair #1806/#1809 (SEQ ID NO: 9) (see 
Table 3) was used to carry out PGR on chromosomal DNA 
samples of all the 54 H type strains and the H7 strains 
listed in Table 1. PGR reactions were carried out under 
the following conditions: denaturing, 94°G/30'; annealing. 
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58°C/30'; extension, 72''C/1'; 30 cycles. PGR reaction was 
carried out in a volume of 50ul for each of the 
chromosomal sample. After the PGR reaction, 5jj,l PGR 
product from each sample was run on an agarose gel to 
check for amplified DNA. 

[00164] Primer pairs #1806/#1809 (SEQ ID NO: 9) 
produced a band of predicted size with all the 11 strains 
expressing H7, but gave no band with other H type strains. 
Thus, these primers are H7 specific. 

15. Testing of oligonucleotide primers specific to H7 of 
015 7 and 055: 

[00165] Based on a comparison of the fliC sequences 
of 11 different H7 strains, we have identified two 
oligonucleotides [#1696 (SEQ ID NQ:85) 

(5'-GGCCTGAGTGAGGGGGCC) at positions 178 to 195 in M527 
and #1697 (SEQ ID NO: 86) ( 5 ' -GAGTTACCGGCCTGCTGA) positions 
1700-1683 in M527] which are unique to H7 of 0157 and 055. 

Although not identical to any parts of the fliC sequences 
of any other H7 strains, these two primers are identical 
or have high level similarity to fliC genes of some other 
H types. However a combination of one of these primers 
with one of the H7 specific primers can give specificity 
for H7 of 0157 :H7 and 055 :H7 E. coll. 

[00166] Primer pairs #1696/#1809 (SEQ ID 
NOS:85/9) and #1697/#1806 (SEQ ID NOS : 8 6/9 ) were used to 
carry out PGR on chromosomal DNA samples of all the H type 
strains and the H7 strains listed in Table 1. PGR 
reactions were carried out under the following conditions: 
denaturing, 94°C/30'; annealing, 61°C/30' (for 
#1696/#1809) (SEQ ID NOS:85/9) or 60°G/30 ' (for#1697/#1806) 

(SEQ ID NOS: 86/9) ; extension, 72°C/1'; 30 cycles. PGR 
reaction was carried out in an volume of 50|j,l for each of 
the chromosomal samples. After the PGR reaction, 5jj,l PGR 
product from each sample was run on an agarose gel to 
check for amplified DNA. 

[00167] Both primer pairs produced a band of 
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predicted size with both of the 0157 :H7 strains (strains 
M1004 and M527, see Table 1), and the 055:H7 strain 
(strain M1686, see Table 1), but gave no band with other 
strains- Thus, these two pairs of primers are specific to 
H7 genes of 0157 :H7 and 055 :H7 E. coll strains. 

16. Identification of flagellln genes for the remaining 
15 H specificities , 

16.1. Sequencing the potential flkA gene coding for the 
H36 flagellln: 

[00168] Using primers #1431 (5'- atg gca caa gtc 
att aat acc caa c) (SEQ ID NQ:81) and #1432 (5'- eta acc 
ctg cag cag aga ca) (SEQ ID NQ:82) , we have amplified two 
bands from the H36 type strain. PGR reaction was carried 
out under the following conditions: denaturing, 94oC/30*; 
annealing, 57oC/30'; extension, 72oC/l'; 30 cycles. These 
two PGR fragments were then cloned into the pGEM-T vector 
using the Promega pGEM-T cloning kit (Madison WI USA) to 
make plasmids pPR1992 and pPR1993. Inserts from both 
plasmids were first sequenced using the M13 universal 
primers (which bind to the pGEM-T DNA flanking the 
insertion site) . For pPR1992, primers based on the 
sequence obtained were then used to sequence further, and 
this procedure was repeated until the insert was fully 
sequenced - 

[00169] The sequence of the insert of pPR1992 is 
identical to that of the H12 flagellln gene sequence 
except perhaps for the first 8 and last 7 codons which are 
encoded by the PGR primers in plasmid pPR1992. We have 
only sequenced the two ends of the insert of plasmid 
pPR1993 (Figures 71 and 72) , and the sequences of the two 
ends of the insert of pPR1993 are very similar to ends of 
other sequenced flagellln genes. We conclude that the 
insert of plasmid pPR1993 encodes a flagellln gene. The 
full sequence of the insert of plasmid pPR1993 can be 
obtained using the same method as for the sequencing of 
the insert of plasmid pPR1992. It is known that flkA 
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gene encodes the H36 flagellin [Ratiner, Y. A. (1998) "New 
flagellin specifying genes in some E. coli strains" J. 
Bacteriol 180: 979-984], and it is highly likely that 
plasmid pPR1993 contains the flkA gene of the H36 type 
strain. H specificities can be confirmed by slide 
agglutination . 

[00170] The currently uncharacterised sequence of 
both ends and of DNA flanking these two sequenced genes 
can be obtained by PGR walking and sequencing. Methods for 
PGR walking from a known sequence to an unknown region in 
chromosomal DNA are available (see [Siebert, P. D., A, 
Chenchi, D. E. Kellogg, A. Lukyanov and S. A. Lukyanov 
(1995) "An improved PGR method for walking in uncloned 
genomic DNA." Nuc. Acids Res. 23: 1087-1088]). 

[00171] The sequenced genes then can be PGR 
amplified and cloned using the method (s) described in 
section 9. Flagellins expressed by strain M212 6 carrying 
these plasmids then can be determined by use of specific 
sera . 

[00172] The sequences flanking the flkA gene can 
then be used to PGR amplify other flkA genes (see below) . 

16.2 The flkA genes coding for H3 , H47 and H53: 

[00173] It has been shown that flagellins H3, H47 
and H53 are encoded by flkA genes in the type strains 
[Ratiner, Y. A. (1998) "New flagellin specifying genes in 
some E. coli strains" J. Bacteriol 180: 979-984]. These 
genes can be PGR amplified using primers based on the 
sequences flanking the flkA gene in the H36 type strain. 
These PGR fragments can then be sequenced, and the genes 
expressed in strain M2126 for the identification of these 
genes . 

16.3 The fllA genes coding for H44 and H55: 

[00174] It is known that flagellins H44 and H55 are 
coded by fllA genes. 

16.3.1 The H55 flagellin gene: 
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[00175] Using primers #1868 (SEQ ID NO: 69) and 
#1870 (SEQ ID NO: 71) (Table 3B) , we have amplified two 
bands from the H55 type strain. PGR reaction was carried 
out under the following conditions: denaturing, 94oC/30'; 
annealing, 50oC/30'; extension, 72oC/l'; 30 cycles. These 
two PGR fragments were then cloned into the pGEM-T vector 
using the Promega pGEM-T cloning kit (Madison WI USA) to 
make plasmids pPR1994 and pPR1989. Inserts from both 
plasmids were first sequenced using the M13 universal 
primers (which bind to the pGEM-T DNA flanking the 
insertion site) . Primers based on the sequence obtained 
were then used to sequence further, and this procedure was 
repeated until both inserts were fully or partly 
sequenced. 

[00176] The sequence of the insert of pPR1994 is 
highly similar to that of the flagellin gene of the H38 
type strain, with 1 amino acid difference in the gene 
products- We have only sequenced the two ends of the 
insert of plasmid pPR1989 (figures 70A and 70B) , and the 
sequences of the two ends of the insert of pPR1989 are 
very similar to ends of other sequenced flagellin genes . 
We conclude that the insert of plasmid pPRl989 encodes a 
flagellin gene. The full sequence of the insert of 
plasmid pPR1989 can be obtained using the same method as 
for the sequencing of the insert of plasmid pPR1994, It 
is known that the H55 type strain carries flagellin genes 
for both H38 and H55, and that the H55 flagellin gene is 
at the fllA locus [Ratiner, Y. A. (1998) "New flagellin 
specifying genes in some E. coli strains" J. Bacteriol 
180: 979-984]. Thus, it is highly likely that plasmid 
pPRl989 contains the fllA gene of the H55 type strain. 

[00177] The currently uncharacterised sequence of 
both ends and of DNA flanking these two sequenced genes 
can be obtained by PGR walking and sequencing. Methods for 
PGR walking from a known sequence to an unknown region in 
chromosomal DNA are available (see [Siebert, P. D. , A. 
Ghenchi, D. E. Kellogg, A. Lukyanov and S- A. Lukyanov 
(1995) "An improved PGR method for ^walking in uncloned 
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genomic DNA. " Nuc. Acids Res. 23: 1087-1088]). 

[00178] The sequenced genes then can be PGR 
amplified and cloned using the method (s) described in 
section 9. Flagellins expressed by strain M2126 carrying 
these plasmids then can be determined by use of specific 
sera . 

16.3.2 The H44 flagellin gene: 

[00179] The sequence information for DNA flanking 
the fllA gene in the H55 type strain can then be used to 
PGR, sequence and identify the fllA gene in the H4 4 type 
strain. 

16.4 The flmA gene coding for H54: 

[00180] This gene can be cloned by making a bank of 
plasmid clones in strain M2126 using chromosomal DNA of 
the H54 type strain and selecting for a transformant which 
is motile on an agar plate. This is done by taking 
advantage of the fact that the H antigen is on flagellin, 
the protein of the bacterial flagellum used for movement 
of the bacteria. Strain M2126 lacks flagellin. Once the 
clone (s) is obtained and identified by use of anti-H54 
serum, the flagellin gene can be sequenced. It is possible 
that clones expressing different flagellin specificities 
can be obtained, and each of them can be identified by 
using different sera. 

16.5 The flagellin genes obtained from the H37 and H48 
type strains : 

[00181] We have used primers #1868 (SEQ ID NO: 69) 
and #1869 (SEQ ID NO: 70) (both were based on the sequence 
obtained from the H48 type strain, also see section 9) and 
primers #1868 (SEQ ID NO: 69) and #1870 (SEQ ID NQ:71) 
(both were based on the sequences of the H7 flagellin gene 
of the H7 type strain, also see section 9) to PGR amplify 
and clone the sequenced flagellin genes from the H4 8 and 
H37 type strains respectively- Strain P5560 carrying the 
plasmid containing either the cloned gene was not motile 
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and did not react with the appropriate antisera. It is 
highly likely that mutaions have occured due to PGR 
errors. This can be resolved by re-amplification and re- 
cloning of the genes. 

16.6 The flagellln gene obtained from the H25 type 
strain: 

[00182] The flagellln gene sequence we first 
obtained from the H25 type strain lacks 23 and 21 codons 
at 5' and 3' ends respectively. We could not amplify the 
full gene from the H25 type strain using primers based on 
the H7 flagellln gene of the H7 type strain, and it was 
necessary to get the full sequence of this flagellln gene 
by other means. 

[00183] We have used primers (#2650 (SEQ ID NO: 87) : 
5* - cag cga tga aat act tgc cat and #2648 (SEQ ID NO: 88) : 
5' - caa tgc ttc gtg acg cac) based on the genes (fliD and 
fllA respectively) flanking fliC gene in E. coll K-12 
[Blattner, F. R., G. I. Plunkett, C. A. Bloch, N. T. 
Perna, V. Burland, M. Riley and et al . (1997) "The 
complete genome sequence of E- Coli Kil2" Science 277: 
1453-1474] and primers (#2658 (SEQ ID NO: 89) : 5' - gcc tga 
gtc aga cct ttg and # 2653 (SEQ ID NO: 90) : 5' - aac ctg 
tct gaa gcg cag) based on the flagellln sequence obtained 
from the H25 type strain to PGR amplify both ends of the 
flagellln gene. The PGR product was then sequenced, and we 
have now obtained the full flagellln gene sequence and 
sequence for the DNA flanking the flagellln gene from type 
strain H25 (Figure 69) . Now, it is straightforward to PGR 
amplify, clone and express, and identify this gene using 
the methods described in sections 9 and 10. 

15.7 The flagellln genes obtained from the H8 and H40 
type strains: 

[00184] The flagellln gene sequences obtained from 
both the H8 and H40 type strains lack 18 and 15 codons at 
5* and 3' ends respectively. We have used primers based on 
the H7 flagellln gene of the H7 type strain to PGR amplify 
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and clone the full genes from these two strains. Strain 
M2126 carrying plasmid made this way was not motile under 
microscope and did not react with the appropriate 
antisera. This could be due to PGR errors as mentioned in 
section 16.5 or perhaps the first and last few amino acids 
encoded by the primers (based on H7 flagellin gene) are 
uncompatible in this case. 

[00185] The full sequence of the full gene can be 
obtained using method described in section 16.6. The 
flagellin gene can then be PGR amplified, cloned and 
expressed, and identified using the methods described in 
sections 9 and 10. 

[00186] The gene products of the flagellin genes 
obtained from the H8 and H40 type strains are identical. 
Thus, one of these two H specificities must be encoded by 
a unknown gene, and it can be cloned and identified using 
the method described in the section 16.8, 

16,8 Flagellin genes coding for H17^ H35r and HBO: 

[00187] As mentioned above, the sequenced flagellin 
genes from the H17 and H50 type strains encode H4 and HIO 
specificities respectively. The flagellin gene sequence 
obtained from the H35 strain has an insertion and encodes 
a non-functional gene (see section 8) . Thus, genes coding 
for these flagellins have not been identified, and their 
location is unknown. One can use primers based on DNA 
flanking fliC, fllA, flkA, and flmA to do PGR on the type 
strain for each of the flagellin antigen. PGR products can 
then be sequenced, and possible genes can be cloned, 
expressed and identified then. 

[00188] If the target gene is not PGR amplified 
using primers based on sequence of these loci or sequence 
flanking these loci, it can be cloned by making a bank of 
plasmid clones in strain M212 6 using chromosomal DNA of 
the type strain and selecting for a transformant which is 
motile on an agar plate. This is done by taking advantage 
of the fact that the H antigen is on flagellin, the 
protein of the bacterial flagellum used for movement of 
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the bacteria. Strain M2126 lacks flagellin. Once the 
clone (s) is obtained and identified by use of antisera, 
the flagellin gene can be sequenced. It is possible that 
clones expressing different flagellin antigens can be 
obtained, and each of them can be identified by using 
different antisera. Antiserum for H50 can be prepared 
using standard methods [Ewing, W.H.:Edwards and Ewing' s 
identification of the EnteroJbacteriaceae, , Elsevier 
Science Publishers, Amsterdam, The Netherlands, 198 6] . 

O antigen 

Materials and Methods-part 1 

[00189] The experimental procedures for the 
isolation and characterisation of the E, call 0111 O 
antigen gene cluster (position 3,021-9,981) are according 
to Bastin D.A. , et ai. 1991 "Molecular cloning and 
expression in Escherichia coli K-12 of the rfjb gene 
cluster determining the O antigen of an E. coli 0111 
strain". Mol. Microbiol. 5:9 2223-2231 and Bastin D.A. 
and Reeves, P.R. 1995 "Sequence and analysis of the O 
antigen gene (rfb) cluster of Escherichia coli 0111". 
Gene 164: 17-23. 

A. Bacterial strains and growth media 

[00190] Bacteria were grown in Luria broth 
supplemented as required. 

B. Cosmids and phage 

[00191] Cosmids in the host strain x2819 were 
repackaged in viv^o. Cells were grown in 250mL flasks 
containing 30mL of culture, with moderate shaking at 30^*0 
to an optical density of 0.3 at 580 nm. The defective 
lambda prophage was induced by heating in a water bath at 
45°C for 15min followed by an incubation at 37 °C with 
vigorous shaking for 2hr. Cells were then lysed by the 
addition of 0 . 3mL chloroform and shaking for a further 
lOmin. Cell debris were removed from ImL of lysate by a 
5min spin in a microcentrifuge, and the supernatant 
removed to a fresh microfuge tube. One drop of 
chloroform was added then shaken vigorously through the 
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tube contents. 

C. DNA preparation 

[00192] Chromosomal DNA was prepared from bacteria 
grown overnight at 37 °C in a volume of 30inL of Luria 
broth. After harvesting by centrif ugation, cells were 
washed and resuspended in lOmL of 50mMTris-HCl pH 8.0. 
EDTA was added and the mixture incubated for 20min. Then 
lysozyme was added and incubation continued for a further 
lOmin. Proteinase K, SDS, and ribonuclease were then 
added and the mixture incubated for up to 2hr for lysis 
to occur. All incubations were at 37 °C- The mixture was 
then heated to 65°C and extracted once with 8mL of phenol 
at the same temperature. The mixture was extracted once 
with 5mL of phenol/chlorof orm/iso-amyl alcohol at 4°C. 
Residual phenol was removed by two ether extractions. 
DNA was precipitated with 2 vols, of ethanol at 4°C, 
spooled and washed in 7 0% ethanol, resuspended in l-2mL 
of TE and dialysed. Plasmid and cosmid DNA was prepared 
by a modification of the Birnboim and Doly method 
[Birnboim, H. C. and Doly, J. (1979) "A rapid alkaline 
extraction procedure for screening recombinant plasmid 
DNA" Nucl. Acid Res. 7:1513-1523]. The volume of culture 
was lOmL and the lysate was extracted with 
phenol/ chlorof orm/iso-amyl alcohol before precipitation 
with isopropanol. Plasmid DNA to be used as vector was 
isolated on a continuous caesium chloride gradient 
following alkaline lysis of cells grown in IL of culture. 

D. Enzymes and buffers - 

[00193] Restriction endonucleases and DNA T4 
ligase were purchased from Boehringer Mannheim (Castle 
Hill, NSW, Australia) or Pharmacia LKB (Melbourne, VIC 
Australia) . Restriction enzymes were used in the 
recommended commercial buffer. 

E. Construction of a gene bank. 

[00194] Individual aliquots of M92 chromosomal DNA 
(strain Stoke W, from Statens Serum Institut, 5 
Artillerive j , 2300 Copenhagen S, Denmark) were partially 
digested with 0.2U Sau3Al for l-15mins. Aliquots giving 



-52- 



U.S. Serial No. 09/701,132 

PROVISIONAL ELECTION AND PRELIMINARY AMENDMENT 
AMENDMENTS TO THE SPECIFICATION 

the greatest proportion of fragments in the size range of 
approximately 40-50kb were selected and ligated to vector 
PPR691 previously digested with BamHl and PvuII. Ligation 
mixtures were packaged in vitro with packaging extract. 
The host strain for transduction was x2819 and 
recombinants were selected with kanamycin. 

F. Serological procedures. 

[00195] Colonies were screened for the presence of 
the 0111 antigen by immunoblotting . Colonies were grown 
overnight;, up to 100 per plate then transferred to 
nitrocellulose discs and lysed with 0 . 5N HCl . Tween 20 
was added to TBS at 0.05% final concentration for 
blocking, incubating and washing steps. Primary antibody 
was E. coli O group 111 antiserum, diluted 1:800, The 
secondary antibody was goat anti-rabbit IgG labelled with 
horseradish peroxidase diluted 1:5000. The staining 
substrate was 4-chloro-l-napthol . Slide agglutination 
was performed according to the standard procedure. 

G. Recombinant DNA methods. 

[00196] Restriction mapping was based on a 
combination of standard methods including single and 
double digests and sub-cloning. Deletion derivatives of 
entire cosmids were produced as follows: aliquots of 
1 . 8mg of cosmid DNA were digested in a volume of 2 0ml 
with 0.25U of restriction enzyme for 5-80min. One half 
of each aliquot was used to check the degree of digestion 
on an agarose gel. The sample which appeared to give a 
representative range of fragments was ligated at 4''C 
overnight and transformed by the CaCl2 method into JM109. 

Selected plasmids were transformed into sfl74 by the 
same method. P4657 was transformed with pPR1244 by 
elect r operation . 

H. DNA hybridisation 

[00197] Probe DNA was extracted from agarose gels 
by electroelution and was nick-translated using [a-32P]- 
dCTP. Chromosomal or plasmid DNA was electrophoresed in 
0.8% agarose and transferred to a nitrocellulose 
membrane. The hybridisation and pre-hybridisation 
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buffers contained either 30% or 50% formamide for low and 
high stringency probing respectively. Incubation 
temperatures were 42 °C and 37 °C for pre-hybridisation and 
hybridisation respectively. Low stringency washing of 
filters consisted of 3 x 20min washes in 2 x SSC and 0.1% 
SDS. High-stringency washing consisted of 3 x 5min 
washes in 2 x SSC and 0.1% SDS at room temperature, a Ihr 
wash in 1 x SSC and 0.1% SDS at 58 ""C and 15min wash in 
0-1 X SSC and 0.1% SDS at 58°C. 

I. Nucleotide sequencing of E. coli 0111 O antigen gene 
cluster (position 3,021-9,981) 

[00198] Nucleotide sequencing was performed using 
an ABI 373 automated sequencer (CA, USA) . The region 
between map positions 3.30 and 7.90 was sequenced using 
uni-directional exonuclease III digestion of deletion 
families made in PT7T3190 from clones pPR1270 and 
pPR1272. Gaps were filled largely by cloning of selected 
fragments into M13mpl8 or Iyil3mpl9. The region from map 
positions 7.90-10.2 was sequenced from restriction 
fragments in M13mpl8 or M13mpl9. Remaining gaps in both 
the regions were filled by priming from synthetic 
oligonucleotides complementary to determined positions 
along the sequence, using a single stranded DNA template 
in M13 or phagemid. The oligonucleotides were designed 
after analysing the adjacent sequence. All sequencing 
was performed by the chain termination method. Sequences 
were aligned using SAP [Staden, R. , 1982 "Automation of 
the computer handling of gel reading data produced by the 
shotgun method of DNA sequencing", Nuc. Acid Res. 10: 
4731-4751; Staden, R. , 1986 "The current status and 
portability of our sequence handling software". Nuc. Acid 
Res, 14: 217-231]. The program NIP [Staden, R. 1982 "An 
interactive graphics program for comparing and aligning 
nucleic acid and amino acid sequence". Nuc. Acid Res, 
10: 2951-2961] was used to find open reading frames and 
translate them into proteins. 

J. Isolation of clones carrying E, coli 0111 O antigen 
gene cluster 
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[00199] The E. coli O antigen gene cluster was 
isolated according to the method of Bastin D.A., et al. 
[1991 "Molecular cloning and expression in Escherichia 
coli K-12 of the rfh gene cluster determining the O 
antigen of an E. coli 0111 strain". Mol - Microbiol. 5(9), 
2223-2231] . Cosmid gene banks of M92 chromosomal DNA 
were established in the in vivo packaging strain x2819. 
From the genomic bank, 3.3 x 10"^ colonies were screened 
with E.coli 0111 antiserum using an immuno-blotting 
procedure: 5 colonies (pPRl054, pPRl055, pPR1056, pPR1058 
and pPR1287) were positive. The cosmids from these 
strains were packaged in vivo into lambda particles and 
transduced into the E. coli deletion mutant Sfl74 which 
lacks all O antigen genes. In this host strain, all 
plasmids gave positive agglutination with 0111 antiserum. 

An Eco Rl restriction map of the 5 independent cosmids 
showed that they have a region of approximately 11.5kb in 
common (Figure 1) . Cosmid pPRlOSB included sufficient 
flanking DNA to identify several chromosomal markers 
linked to O antigen gene cluster and was selected for 
analysis of the O antigen gene cluster region. 
K, Restriction mapping of cosmid pFR1058 

[00200] Cosmid pPRlOSB was mapped in two stages. 
A preliminary map was constructed first, and then the 
region between map positions 0.00 and 23-10 was mapped in 
detail, since it was shown to be sufficient for 0111 
antigen expression. Restriction sites for both stages 
are shown in Figure 2 . The region common to the five 
cosmid clones was between map positions 1.35 and 12.95 of 
pPR1058 . 

[00201] To locate the O antigen gene cluster 
within pPR1058, pPRlOSB cosmid was probed with DNA probes 
covering O antigen gene cluster flanking regions from 5. 
enterica LT2 and E.coli K-12. Capsular polysaccharide 

(cps) genes lie upstream of O antigen gene cluster while 
the gluconate dehydrogenase ignd) gene and the histidine 

[his) operon are downstream, the latter being further 
from the O antigen gene cluster. The probes used were 
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pPR472 (3-35kb), carrying the gnd gene of LT2, pPR685 
(5.3kb) carrying two genes of the cps cluster, cpsB and 
cpsG of LT2, and K350 (16.5kb) carrying all of the his 
operon of K-12. Probes hybridised as follows: pPR472 
hybridised to 1.55kb and 3.5 kb (including 2.7 kb of 
vector) fragments of Pstl and i^indlll double digests of 
pPR1246 (a Hindlll/EcoRl subclone derived from pPR1058, 
Figure 2), which could be located at map positions 12.95- 
15.1; pPR685 hybridised to a 4 . 4 kb EcoRl fragment of 
pPR1058 (including 1.3 kb of vector) located at map 
position 0.00-3.05; and K350 hybridised with a 32kb EcoRl 
fragment of pPRlOSB (including 4.0kb of vector), located 
at map position 17.30-45.90. Subclones containing the 
presumed gnd region complemented a gnd'edd" strain 
GB23152. On gluconate bromothymol blue plates,. pPR1244 
and pPR1292 in this host strain gave the green colonies 
expected of a gnd'^edd" genotype . The hls'^ phenotype was 
restored by plasmid pPR1058 in the his deletion strain 
Sfl74 on minimal medium plates, showing that the plasmid 
carries the entire his operon, 

[00202] It is likely that the O antigen gene 
cluster region lies between gnd and cps, as in other E. 
coli and S. enterics strains, and hence between the 
approximate map positions 3.05 and 12.95. To confirm 
this, deletion derivatives of pPRl058 were made as 
follows: first, pPR1058 was partially digested with 
Hindlll and self ligated. Transf ormants were selected 
for kanamycin resistance and screened for expression of 
0111 antigen. Two colonies gave a positive reaction. 
EcoRl digestion showed that the two colonies hosted 
identical plasmids, one of which was designated pPR1230, 
with an insert which extended from map positions 0.00 to 
23.10. Second pPRl058 was digested with Sail and 
partially digested with Xhol and the compatible ends were 
re-ligated. Transf ormants were selected with kanamycin 
and screened for 0111 antigen expression. Plasmid DNA 
of 8 positively reacting clones was checked using EcoRl 
and Xhol digestion and appeared to be identical. The 
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cosmid of one was designated pPR1231. The insert of 
pPR1231 contained the DNA region between map positions 
0.00 and 15.10. Third, pPR1231 was partially digested 
with Xholf self-ligated, and transf ormants selected on 
spectinomycin/ streptomycin plates. Clones were screened 
for kanamycin sensitivity and of 10 selected, all had the 
DNA region from the Xi^ol site in the vector to the Xhol 
site at position 4.00 deleted. These clones did not 
express the Olll antigen, showing that the Xhol site at 
position 4.00 is within the O antigen gene cluster. One 
clone was selected and named pPRl288. Plasmids pPR1230, 
pPR1231, and pPR1288 are shown in Figure 2. 

L. Analysis of the E. coli 0111 O antigen gene 
cluster (position 3^021-9^981) nucleotide sequence data 

[00203] Bastin and Reeves [1995 "Sequence and 
analysis of the O antigen gene (rfh) cluster of 
Escherichia coli Olll". Gene 164: 17-23] partially 
characterised the E.coli Olll O antigen gene cluster by 
sequencing a fragment from map position 3,021-9,981, 
Figure 3 shows the gene organisation of position 3,021- 
9,981 of E. coli Olll O antigen gene cluster. orf3 and 
orf6 have high level amino acid identity with wcaH and 
wcaG (46-3% and 37.2% respectively), and are likely to be 
similar in function to sugar biosynthetic pathway genes 
in the E. coli K-12 colanic gene cluster. orf4 and orfS 
show high levels of amino acid homology to manC and inanB 
genes respectively- orfl shows high level homology with 
rfbH which is an abequose pathway gene. orfS encodes a 
protein with 12 transmembrane segments and has similarity 
in secondary structure to other wzx genes and is likely 
therefore to be the O antigen flippase gene. 

Materials and Methods-part 2 

A. Nucleotide sequencing of 1 to 3,020 and 9,982 to 
14,516 of the E, coli Olll O antigen gene cluster 

[00204] The sub clones which contained novel 
nucleotide sequences, pPR1231 (map position 0 and 1,510), 
pPR1237 (map position -300 to 2,744), pPRl239 (map 
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position 2,744 to 4,168), pPR1245 (map position 9,736 to 
12,007) and pPR1246 (map position 12,007 to 15,300) 
(Figure 2), were characterised as follows: the distal 
ends of the inserts of pPR1237, pPRl239 and pPRl245 were 
sequenced using the M13 forward and reverse primers 
located in the vector, PGR walking was carried out to 
sequence further into each insert using primers based on 
the sequence data and the primers were tagged with M13 
forv7ard or reverse primer sequences for sequencing. This 
PGR walking procedure was repeated until the entire 
insert was sequenced. pPR1246 was characterised from 
position 12,007 to 14,516. The DNA of these sub clones 
was sequenced in both directions. The sequencing 
reactions were performed using the dideoxy termination 
method and thermocycling and reaction products were 
analysed using fluorescent dye and an ABI automated 
sequencer (CA, USA) . 

B. Analysis of the E. coli 0111 O antigen gene cluster 
(positions 1 to 3,020 and 9,982 to 14,516 of Figure 5) 
nucleotide sequence data 

[00205] The gene organisation of regions of E. 
coli 0111 O antigen gene cluster which were not 
characterised by Bastin and Reeves [1995 "Sequence and 
analysis of the O antigen gene (rfjb) cluster of 
Escherichia coli 0111." Gene 164: 17-23], (positions 1 to 
3,020 and 9,982 to 14,516) is shown in Figure 3. There 
are two open reading frames in region 1 . Four open 
reading frames are predicted in region 2. The position 
of each gene is listed in Table 9. 

[00206] The deduced amino acid sequence of orjfl 
iwbdH) shares about 64% similarity with that of the rfp 
gene of Shigella dysenteriae. Rfp and WbdH have very 
similar hydrophobicity plots and both have a very 
convincing predicted transmembrane segment in a 
corresponding position. rfp is a galactosyl transferase 
involved in the synthesis of LPS core, thus wbdH is 
likely to be a galactosyl transferase gene. orf2 has 
85.7% identity at amino acid level to the gmd gene 
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identified in the E. coli K-12 colanic acid gene cluster 
and is likely to be a gmd gene. orf9 encodes a protein 
with 10 predicted transmembrane segments and a large 
cytoplasmic loop. This inner membrane topology is a 
characteristic feature of all known 0 antigen polymerases 
thus it is likely that orf9 encodes a 0 antigen 
polymerase gene, wzy- orflO (wbdL) has a deduced amino 
acid sequence with low homology with Lsi2 of Neisseria 
gonorrhoeae. Lsi2 is responsible for adding GlcNAc to 
galactose in the synthesis of lipooligosaccharide . Thus 
it is likely that wbdL is either a colitose or glucose 
transferase gene. orfll (wbdM) shares high level 
nucleotide and amino acid similarity with TrsE of 
Yersinia enterocolitica . TrsE is a putative sugar 
transferase thus it is likely that wbdM encodes the 
colitose or glucose transferase. 

[00207] In summary three putative transferase 
genes and an 0 antigen polymerase gene were identified at 
map position 1 to 3,020 and 9,982 to 14,516 of E. coli 
0111 O antigen gene cluster. A search of GenBank has 
shown that there are no genes with significant similarity 
at the nucleotide sequence level for two of the three 
putative transferase genes or the polymerase gene. 
Figure 5 provides the nucleotide sequence of the 0111 
antigen gene cluster. 

Materials and Methods-part 3 

A. PGR amplification of 0157 antigen gene cluster from 
an E. coli 0157 :H7 strain (Strain G664-1992, from Statens 
Serum Institut, 5 Artillerive j , 2300, Copenhagen S, 
Denmark) : 

[00208] E. coli 0157 O antigen gene cluster was 
amplified by using long PGR [Cheng et al . 1994, 
"Effective amplification of long targets from cloned 
inserts and human and genomic DNA" P.N.A.S. USA 91: 5695- 
569] with one primer (primer #412 (SEQ ID NO: 91) : att ggt 
age tgt aag cca agg gcg gta gcg t) based on the JumpStart 
sequence usually found in the promoter region of O 
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antigen gene clusters [Hobbs, et ai. 1994 "The JumpStart 
sequence: a 3 9 bp element common to several 
polysaccharide gene clusters" Mol. Microbiol. 12: 855- 
856] , and another primer #482 (cac tgc cat acc gac gac 
gcc gat ctg ttg ctt gg) (SEQ ID NO: 92) based on the gnd 
gene usually found downstream of the O antigen gene 
cluster. Long PGR was carried out using the Expand Long 
Template PGR System from Boehringer Mannheim (Gastle Hill 
NSW Australia), and products, 14kb in length, from 
several reactions were combined and purified using the 
Promega Wizard PGR preps DNA purification System (Madison 
WI USA) . The PGR product was then extracted with phenol 
and twice with ether, precipitated with 70% ethanol, and 
resuspended in 40mL of water. 
B. Construction of a random DNase I bank: 

[00209] Two aliquots containing about 150ng of DNA 
each were subjected to DNase I digestion using the 
Novagen DNase I Shotgun Gleavage (Madison WI USA) with a 
modified protocol as described. Each aliquot was diluted 
into 45ml of 0.05M Tris -HCl (pH7-5), 0.05mg/mL BSA and 
lOmM MnGl2. 5mL of 1:3000 or 1:4500 dilution of DNasel 
(Novagen) (Madison WI USA) in the same buffer was added 
into each tube respectively and 10ml of stop buffer 
(lOOmM EDTA) , 30% glycerol, 0.5% Orange G, 0.075% xylene 
and cyanol (Novagen) (Madison WI USA) was added after 
incubation at 15''G for 5 min. The DNA from the two DNasel 
reaction tubes were then combined and fractionated on a 
0.8% LMT agarose gel, and the gel segment with DNA of 
about Ikb in size (about 1 . 5mL agarose) was excised. DNA 
was extracted from agarose using Promega Wizard PGR Preps 
DNA Purification (Madison WI USA) and resuspended in 200 
mL water, before being extracted with phenol and twice 
with ether, and precipitated- The DNA was then 
resuspended in 17.25 mL water and subjected to T4 DNA 
polymerase repair and single dA tailing using the Novagen 
Single dA Tailing Kit (Madison WI USA) . The reaction 
product (8 5ml containing about 8ng DNA) was then 
extracted with chloroform: isoamyl alcohol (24:1) once and 
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ligated to 3x 10"^ pmol pGEM-T (Promega) (Madison WI USA) 
in a total volume of lOOmL. Ligation was carried out 
overnight at 4^C and the ligated DNA was precipitated and 
resuspended in 20mL water before being electroporated 
into E. coli strain JM109 and plated out on BCIG-IPTG 
plates to give a bank. 

C. Sequencing 

[00210] DNA templates from clones of the bank were 
prepared for sequencing using the 96-well format plasmid 
DNA miniprep kit from Advanced Genetic Technologies Corp 
(Gaithersburg MD USA) . The inserts of these clones were 
sequenced from one or both ends using the standard M13 
sequencing primer sites located in the pGEM-T vector. 
Sequencing was carried out on an ABI377 automated 
sequencer {CA USA) as described above, after carrying out 
the sequencing reaction on an ABI Catalyst (CA USA) . 
Sequence gaps and areas of inadequate coverage were PGR 
amplified directly from 0157 chromosomal DNA using 
primers based on the already obtained sequencing data and 
sequenced using the standard M13 sequencing primer sites 
attached to the PGR primers. 

D. Analysis of the E, coli 0157 O antigen gene cluster 
nucleotide sequence data 

[00211] Sequence data were processed and analysed 
using the Staden programs [Staden, R. , 1982 "Automation 
of the computer handling of gel reading data produced by 
the shotgun method of DNA sequencing," Nuc, Acid Res. 10: 
4731-4751; Staden, R. , 1986 "The current status and 
portability of our sequence handling software". Nuc. Acid 
Res. 14: 217-231; Staden, R. 1982 "An interactive 
graphics program for comparing and aligning nucleic acid 
and amino acid sequence". Nuc, Acid Res. 10: 2951-2961]. 

Figure 4 shows the structure of E. coli 0157 O antigen 
gene cluster. Twelve open reading frames were predicted 
from the sequence data, and the nucleotide and amino acid 
sequences of all these genes were then used to search the 
GenBank database for indication of possible function and 
specificity of these genes. The position of each gene is 
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listed in Table 9. The nucleotide sequence is presented 
in Figure 6, 

[00212] orjfs 10 and 11 showed high level identity 
to manC and laanB and were named manC and manB 
respect i\rely. orfl showed 8 9% identity (at amino acid 
level) to the gmd gene of the E. coli colanic acid 
capsule gene cluster (Stevenson G., K. et al. 1996 
"Organisation of the Escherichia coli K-12 gene cluster 
responsible for production of the extracellular 
polysaccharide colanic acid". J. Bacterid. 178:4885-4893) 
and was named gmd, orfS showed 7 9% and 69% identity (at 
amino acid level) respectively to wcaG of the E, coli 
colanic acid capsule gene cluster and to wbcJ (orfl4.8) 
gene of the Yersinia enterocolitica 08 O antigen gene 
cluster (Zhang, L. et al. 1997 "Molecular and chemical 
characterization of the lipopolysaccharide 0-antigen and 
its role in the virulence of Y. ant&rocolitica serotype 
08".Mol. Microbiol. 23:63-76). Colanic acid and the 
Yersinia 08 O antigen both contain fucose as does the 
0157 O antigen. There are two enzymatic steps required 
for GDP-L-fucose synthesis from GDP-4-keto-6-deoxy-D" 
mannose, the product of the gmd gene product- However, 
it has been shown recently (Tonetti, M et al- 1996 
Synthesis of GDP-L-fucose by the human FX protein J, 
Biol. Chem. 271:27274-27279) that the human FX protein 
has "significant homology" with the wcaG gene (referred 
to as Yefh in that paper) , and that the FX protein 
carries out both reactions to convert GDP-4-keto-6-deoxy- 
D-mannose to GDP-L-fucose. We believe that this makes a 
very strong case for orf8 carrying out these two steps 
and propose to name the gene fcl. In support of the one 
enzyme carrying out both functions is the observation 
that there are no genes other than manB^ manC, gmd and 
fcl with similar levels of similarity between the three 
bacterial gene clusters for fucose containing structures . 

[00213] orf"5 is very similar to v/beE (rfbE) of 
Vibrio cholerae 01, which is thought to be the perosamine 
synthetase, which converts GDP-4-keto-6-deoxy-D-mannose 
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to GDP-perosamine (Stroeher, U.H et ai . 1995 "A putative 
pathway for perosamine biosynthesis is the first function 
encoded within the rfh region of Vibrio cholerae" 01. 
Gene 166: 33-42). V. cholerae 01 and E. coli 0157 O 
antigens contain perosamine and N-acetyl-perosamine 
respectively- The V. cholerae 01 manA, manB, gmd and 
wbeE genes are the only genes of the V. cholerae 01 gene 
cluster with significant similarity to genes of the E. 
coli 0157 gene cluster and we believe that our 
observations both confirm the prediction made for the 
function of wbe at V, cholerae, and show that orjf5 of the 
0157 gene cluster encodes GDP-perosamine synthetase, 
orf5 is therefore named per. orf5 plus about lOObp of 
the upstream region (postion 4022-5308 ) was previously 
sequenced by Bilge, S.S. et al- [1996 "Role of the 
Escherichia coli 0157-H7 O side chain in adherence and 
analysis of an rfb locus" . Infect • Immun. 64:4795-4801]. 

[00214] orfl2 shows high level similarity to the 
conserved region of about 50 amino acids of various 
members of an acetyltransf erase family (Lin, W., et al. 
1994 "Sequence analysis and molecular characterisation of 
genes required for the biosynthesis of type 1 capsular 
polysaccharide in Staphylococcus aureus". J. Bateriol. 
176: 7005-7016) and we believe it is the N- 
acetyltransf erase to convert GDP-perosamine to GDP- 
perNAc. orfl2 has been named wbdR^ 

[00215] The genes manB, manC, gmd, fcl, per and 
wbdR account for all of the expected biosynthetic pathway 
genes of the 0157 gene cluster. 

[00216] The remaining biosynthetic step(s) 
required are for synthesis of UDP-GalNAc from UDP-Glc. 
It has been proposed (Zhang, L., et al. 1997 "Molecular 
and chemical characterisation of the lipopolysaccharide 
0-antigen and its role in the virulence of Yersinia 
enterocolitica serotype 08". Mol . Microbiol. 23:63-76) 
that in Yersinia enterocolitica UDP-GalNAc is synthesised 
from UDP-GlcNAc by a homologue of galactose epimerase 
(GalE) , for which there is a galE like gene in the 
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Yersinia Bnterocolitica 08 gene cluster. In the case of 
0157 there is no galE homologue in the gene cluster and 
it is not clear how UDP-GalNAc is synthesised. It is 
possible that the galactose epimerase encoded by the galE 
gene in the gal operon, can carry out conversion of UDP- 
GlcNAc to UDP-GalNAc in addition to conversion of UDP-Glc 
to UDP-Gal. There do not appear to be any gene(s) 
responsible for UDP-GalNAc synthesis in the 0157 gene 
cluster . 

[00217] orf4 shows similarity to many wzx genes 
and is named wzx and orf2 which shows similarity of 
secondary structure in the predicted protein to other wzy 
genes and is for that reason named wzy. 

[00218] The orfl, orf3 and orf6 gene products all 
have characteristics of transferases, and have been named 
whdN, whdO and whdP respectively. The 0157 O antigen has 
4 sugars and 4 transferases are expected. The first 
transferase to act would put a sugar phosphate onto 
undecaprenol phosphate. The two transferases known to 
perform this function, WbaP (RfbP) and WecA (Rfe) 
transfer galactose phosphate and N-acetyl-glucosamine 
phosphate respectively to undecaprenol phosphate. 
Neither of these sugars is present in the 0157 structure. 

[00219] Further, none of the presumptive 
transferases in the 0157 gene cluster has the 
transmembrane segments found in WecA and WbaP which 
transfer a sugar phosphate to undecaprenol phosphate and 
expected for any protein which transferred a sugar to 
undecaprenol phosphate which is embedded within the 
membrane . 

[00220] The WecA gene which transfers GlcNAc-P to 
undecaprenol phosphate is located in the Enterobactereal 
Common Antigen (EGA) gene cluster and it functions in EGA 
synthesis in most and perhaps all E. coli strains, and 
also in O antigen synthesis for those strains which have 
GlcNAc as the first sugar in the O unit . 

[00221] It appears that WecA acts as the 
transferase for addition of GalNAc-l-P to undecaprenol 



-64- 



U.S. Serial No- 09/701,132 

PROVISIONAL ELECTION AND PRELIMINARY AMENDMENT 
AMENDMENTS TO THE SPECIFICATION 



phosphate for the Yersinia enterocolitica 08 O antigen 
[Zhang et al.l997 "Molecular and chemical 

characterisation of the lipopolysaccharide O antigen and 
its role in the virulence of Yersinia enterocolitica 
serotype 08" Mol - Microbiol. 23: 63-76.] and perhaps does 
so here as the 0157 structure includes GalNAc. WecA has 
also been reported to add Glucose-l-P phosphate to 
undecaprenol phosphate in E. coli 08 and 09 strains, and 
an alternative possibility for transfer of the first 
sugar to undecaprenol phosphate is WecA mediated transfer 
of glucose, as there is a glucose residue in the 0157 O 
antigen. In either case the requisite number of 
transferase genes are present if GalNAc or Glc is 
transferred by WecA and the side chain Glc is transferred 
by a transferase outside of the 0 antigen gene cluster. 

[00222] orf9 shows high level similarity (44% 
identity at amino acid level, same length) with wcaH gene 
of the E. coli colanic acid capsule gene cluster. The 
function of this gene is unknown, and we give orf9 the 
name wbdQ. 

[00223] The DNA between manB and wdbR has strong 
sequence similarity to one of the H-repeat units of E. 
coli K12. Both of the inverted repeat sequences flanking 
this region are still recognisable, each with two of the 
11 bases being changed. The H-repeat associated protein 
encoding gene located within this region has a 267 base 
deletion and mutations in various positions. It seems 
that the H-repeat unit has been associated with this gene 
cluster for a long period of time since it translocated 
to the gene cluster, perhaps playing a role in assembly 
of the gene cluster as has been proposed in other cases. 

Materials and Methods - part 4 

[00224] To test our hypothesis that 0 antigen 
genes for transferases and the wzx^ wzy genes were more 
specific than pathway genes for diagnostic PGR, we first 
carried out PGR using primers for all the E. coli 016 O 
antigen genes (Table 7) . The PGR was then carried out 
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using PGR primers for E.coll 0111 transferase, wzx and 
wzy genes (Table 8, 8A) • PGR was also carried out using 
PGR primers for the E. coll 0157 transferase, wzx and wzy 
genes (Table 9, 9A) . 

[00225] Ghromosomal DNA from the 166 serotypes of 
E. coll available from Statens Serum Institut, 5 
Artillerive j , 2300 Gopenhagen Denmark was isolated using 
the Promega Genomic (Madison WI USA) isolation kit. Note 
that 164 of the serogroups are described by Ewing W. H. : 
Edwards and Ewings "Identification of the 

Enterobacteriacea" Elsevier, Amsterdam 1986 and that they 
are numbered 1-171 with numbers 31, 47, 67, 72, 93, 94 
and 122 no longer valid. Of the two serogroup 19 strains 
we used 19ab strain F8188-41. Lior H. 1994 
["Glassif ication of Escherichia coli In Escherichia coli 
in domestic animals and humans pp 31-72. Edited by G.L. 
Gyles GAB international] adds two more numbered 172 and 
173 to give the 166 serogroups used. Pools containing 5 
to 8 samples of DNA per pool were made. Pool numbers 1 
to 19 (Table 4) were used in the E. coli 0111 and 0157 
assay. Pool numbers 20 to 28 were also used in the 0111 
assay, and pool numbers 22 to 24 contained E. coli 0111 
DNA and were used as positive controls (Table 5) . Pool 
numbers 29 to 42 were also used in the 0157 assay, and 
pool numbers 31 to 36 contained E. coli 0157 DNA, and 
were used as positive controls (Table 6) . Pool numbers 2 
to 20, 30, 43 and 44 were used in the E. coli 016 assay 
(Tables 4 to 6) . Pool number 44 contained DNA of E. coli 
K-12 strains G600 and WGl and was used as a positive 
control as between them they have all of the E. coli K-12 
016 O antigen genes. 

[00226] PGR reactions were carried out under the 
following conditions: denaturing 94''G/30"; annealing, 
temperature varies (refer to Tables) /30"; extension, 
72^^0/1'; 30 cycles. PGR reaction was carried out in a 
volume of 25mL for each pool. After the PGR reaction, 
lOmL PGR product from each pool was run on an agarose gel 
to check for amplified DNA. 
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[00227] Each E. coli chromosomal DNA sample was 
checked by gel electrophoresis for the presence of 
chromosomal DNA and by PGR amplification of the E. coli 
mdh gene using oligonucleotides based on E. coli K-12 
[Boyd et al. (1994) "Molecular genetic basis of allelic 
polymorphism in malate degydrogenase {mdh) in natural 
populations of Escherichia coli and Sa Imon el la enterica " 
Proc- Nat- Acad. Sci. USA. 91:1280-1284.] Chromosomal 
DNA samples from other bacteria were only checked by gel 
electrophoresis of chromosomal DNA. 

A. Primers based on E. coli 016 O antigen gene cluster 
sequence . 

[00228] The O antigen gene cluster of E, coli 016 
was the only typical E. coli O antigen gene cluster that 
had been fully sequenced prior to that of 0111, and we 
chose it for testing our hypothesis. One pair of primers 
for each gene was tested against pools 2 to 20, 30 and 43 
of E. coli chromosomal DNA. The primers, annealing 
temperatures and functional information for each gene are 
listed in Table 8. 

[00229] For the five pathway genes, there were 
17/21, 13/21, 0/21, 0/21, 0/21 positive pools for rjnIB, 
rmlD, rmlA, rmlC and glf respectively (Table 7). For the 
wzXf wzy and three transferase genes there were no 
positives amongst the 21 pools of E. coli chromosomal DNA 
tested (Table 7). In each case the #44 pool gave a 
positive result. 

B. Primers based on the E> coli 0111 O antigen gene 
cluster sequence. 

[00230] One to four pairs of primers for each of 
the transferase, wzx and wzy genes of 0111 were tested 
against the pools 1 to 21 of E. coli chromosomal DNA 
(Table 8) . For wbdH, four pairs of primers, which bind 
to various regions of this gene, were tested and found to 
be specific for 0111 as there was no amplified DNA of the 
correct size in any of those 21 pools of E. coli 
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chromosomal DNA tested. Three pairs of primers for wbdM 
were tested, and they are all specific although primers 
#985/#986 produced a band of the wrong size from one 
pool. Three pairs of primers for wzx were tested and 
they all were specific. Two pairs of primers were tested 
for wzy, both are specific although #980/#983 gave a band 
of the wrong size in all pools. One pair of primers for 
wbdL was tested and found unspecific and therefore no 
further test was carried out. Thus, wzxjr wzy and two of 
the three transferase genes are highly specific to 0111- 

Bands of the wrong size found in amplified DNA are 
assumed to be due to chance hybridisation of genes widely 
present in E. coli. The primers, annealing temperatures 
and positions for each gene are in Table 8. 

[00231] The 0111 assay was also performed using 
pools including DNA from O antigen expressing Yersinia 
pseudotuhorculosis f Shigella hoydii and Salmonella 
enterica strains (Table 8A) . None of the 

oligonucleotides derived from wbdH^ wzk, wzy or wbdM gave 
amplified DNA of the correct size with these pools. 
Notably, pool number 25 includes S. enterica Adelaide 
which has the same O antigen as E. coli 0111: this pool 
did not give a positive PGR result for any primers tested 
indicating that these genes are highly specific for E. 
coli 0111. 

[00232] Each of the 12 pairs binding to wbdH^ wzx, 
wzy and wbdM produces a band of predicted size with the 
pools containing 0111 DNA (pools number 22 to 24) . As 
pools 22 to 24 included DNA from all strains present in 
pool 21 plus 0111 strain DNA (Table 5), we conclude that 
the 12 pairs of primers all give a positive PGR test with 
each of three unrelated 0111 strains but not with any 
other strains tested. Thus these genes are highly 
specific for E. coli 0111. 

C. Primers based on the E, coli 0157 O antigen gene 
cluster sequence. 

[00233] Two or three primer pairs for each of the 
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transferase, w^x and p/zy genes of 0157 were tested 
against E. coll chromosomal DNA of pools 1 to 19, 29 and 
30 (Table 9) . For whdN, three pairs of primers, which 
bind to various regions of this gene, were tested and 
found to be specific for 0157 as there was no amplified 
DNA in any of those 21 pools of E. coli chromosomal DNA 
tested. Three pairs of primers for wbdO were tested, and 
they are all specific although primers # 1211/#1212 
produced two or three bands of the wrong size from all 
pools. Three pairs of primers were tested for wbdP and 
they all were specific. Two pairs of primers were tested 
for whdR and they were all specific. For wzy, three 
pairs of primers were tested and all were specific 
although primer pair #1203/#1204 produced one or three 
bands of the wrong size in each pool. For wzx, two pairs 
of primers were tested and both were specific although 
primer pair #1217/#1218 produced 2 bands of wrong size in 
2 pools, and 1 band of wrong size in 7 pools. Bands of 
the wrong size found in amplified DNA are assumed to be 
due to chance hybridisation of genes widely present in E. 
coli. The primers, annealing temperatures and function 
information for each gene are in Table 9. 

[00234] The 0157 assay was also performed using 
pools 37 to 42, including DNA from O antigen expressing 
Yersinia pseudotuberculosis, Shigella boydii, Yersinia 
enterocolitica 09, Brucella abortus and Salmonella 
enterica strains (Table 9A) . None of the 

oligonucleotides derived from wbdN, wzy, wbdO, wzx, wbdP 
or wbdR reacted specifically with these pools, except 
that primer pair #1203/#1204 produced two bands with Y. 
enterocolitica 09 and one of the bands is of the same 
size with that from the positive control. Primer pair 
#1203/#1204 binds to wzy. The predicted secondary 
structures of Wzy proteins are generally similar, 
although there is very low similarity at amino acid or 
DNA level among the sequenced wzy genes. Thus, it is 
possible that Y. enterocolitica 09 has a wzy gene closely 
related to that of E. coli 0157. It is also possible 
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that this band is due to chance hybridization of another 
gene, as the other two ivzy primer pairs (#1205/#1206 and 
#1207/#1208) did not produce any band with Y, 
enterocolitlca 09. Notably, pool number 37 includes S. 
enterica Landau which has the same O antigen as E, coll 
0157, and pool 38 and 39 contain DNA of B. abortus and Y. 
enterocolitlca 09 which cross react serologically with E, 
coll 0157. This result indicates that these genes are 
highly 0157 specific, although one primer pair may have 
cross reacted with Y. enterocolitlca 09. 

[00235] Each of the 16 pairs binding to whdN, wzk, 
wzYr wlDdO^ wbdP and wbdR produces a band of predicted 
size with the pools containing 0157 DNA (pools number 31 
to 36) . As pool 29 included DNA from all strains present 
in pools 31 to 36 other than 0157 strain DNA (Table 6) , 
we conclude that the 16 pairs of primers all give a 
positive PGR test with each of the five unrelated 0157 
strains • 

[00236] Thus, PGR using primers based on genes 
wbdN, wzy, wbdO, wzx, wbdP and wbdR is highly specific 
for E, coll 0157, giving positive results with each of 
six unrelated 0157 strains while only one primer pair 
gave a band of the expected size with one of three 
strains with O antigens known to cross-react 
serologically with E, coll 0157 . 
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TABLE 1 

H7 s-traxns used in t:his work in addi-bion -bo the H 
antigens type strains 



Nax&e used 


Ser o type 




Souri 


in this 




name 




study 








M527 


0157 :H7 


C664-1992 


a 


M917 


018ac:H7 


A57 


I MVS 


M918 


018ac:H7 


A62 


I MVS 


M973 


02 :H7 


A1107 


CDC 


M1004 


0157 :H7 


EH7 


b 


M1179 


018ac:H7 


D-M3291/54 


I MVS 


M1200 


07:H7 


A64 


c 


M1211 


019ab:H7 


F8188-41 


I MVS 


M1328 


053:H7 


14097 


IMVS 


M1686 


055 :H7 


TB156 


d 



-k 

a . 



c . 



d. 

IMVS, 
CDC, 



Statens Serum Institut, Copenhagen, Denmark. 

Dr R. Brown of Royal Children's Hospital, Melbourne, 
Australia . 

Max-Planck Institut fur molekulare Genetik, Berlin, 
Germany . 

Dr P. Tarr of Children's Hospital and Medical Center, 
University of Washington, USA. 

Institute of Medical and veterinary Science, Adelaide, 
Australia . 

Centers for Disease Control and prevention, Atlanta, 
USA, 
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Table 2 



Oligonucleotides used to PGR amplify f/iC genes 
from different H tvoe strains for seauencina 


H Type 
Strains 


Annealing 
TerriDerature (°C) 


Primers Used 


SEQ ID NO 


1 


55 


#1575/#1576 


83/84 








77/70 


3 


55 


#1285/#1286 


77/78 


4 


50 


#1431^1432 


81/82 


5 


60 


#1285/#1286 


77/78 


6 


55 


#1575/#1 576 


83/84 


7 


55 


#1575/#1576 


83/84 


8 


55 


#1431/#T432 




9 


60 


#1575/#1676 


83/84 


10 


55 


#1575/#1576 


83/84 


11 


55 


#1285/#1286 


77/76 


12 


60 


#1575/#1576 


83/84 


14 


60 


#1575/#1576 


83/84 


15 


60 


#1575/#1576 


83/84 


16 


60 


#1575/#1576 


83/84 


17 


60 


#1417^1418 


79/80 


18 


60 


#1575/#1576 


83/84 


19 


60 


#1575/#1576 


83/84, 


20 


60 


#1575/#1576 


63/84 


21 


55 


#1285/#1286 


77/78 


23 


60 


#1575/#1576 


83/8^ 


24 


60 


#1285.^1286 


77/78 


25 


60 


#1417/#1418 


79/80 


26 


60 


#1575^1576 


83/84 


27 


50 


#1431/#1432 


81/82 


28 


60 


#1575/#1576 


83/84 


29 


60 


#1285^1286 


77/78 


30 


60 


#1575/#1576 


83/84 


31 


60 


#1575/#1576 




32 


60 


#1575/#1576 


63/6^ 


33 


60 


#1285/1286 


77/7^ 


34 


55 


#1575/#1576 


83/84 


35 


50 


#1431/#1432 


81 /S2 


37 


60 


#1285/#1286 


77/78 


38 


60 


#1285^1286 


77/78 


39 


55 


#1285^1286 


77/78 


40 


55 


#1285y#1286 


77_/za 


41 


60 


#1575/#1576 


83/84 


42 


60 


#1285^1286 


77/za 


43 


60 


#1575/#1576 


83JM 


AA 


Rn 
dU 


ffl/tSO/ffI ZOO 


llLUi. 


45 


60 


#1575/#1576 


83/84 


46 


60 


#1575/#1576 


83/84 


47 


55 


#1285/#1286 


77/78 


48 


60 


#1575/#1576 


83/84 


49 


60 


#1575^1576 


83/84 


50 


60 


#1285/#1286 


77/78 


51 


60 


#1575/#1576 


83/84 


52 


60 


#1575^;^1576 


83/84 


54 


50 


#1431^1432 


81/82 


55 


60 


#1285/#1286 


77/78 


56 


60 


#1285/#1286 


77/78 



-72- 



U.S. Serial No, 09/701,132 

PROVISIONAL ELECTION AND PRELIMINARY AMENDMENT 
AMENDMENTS TO THE SPECIFICATION 



Table 3 

Summary of the flagellin sequences obtained and specific H type 



oiiaonucleotide orim 


ers 


Htype 
strain(s) the 
sequenced 
gene(s) 
obtained from 


vSEO ID 

NO: 


H specificity 
coded by 
the gene(s) 


H type strain from 
which the fiageilin gene 
sequence was used for 
primer choice 


Positions of 
primer 1 


Positions of 
primer 2 


1 


66 


1 


1 


892-909 


1172-1189 


2 


&Z 


2 


2 


568-587 


1039-1056 


4,17.44 


6.17.42 


4 


4 


466-483 


628-648 


5 


Z 


5 


5 


697-714 


877-897 


6 




6 


6 


565-585 


799-816 


7 




7 


7 


553-570 
(Drimer #1806) 


1483-1500 
(orimer #1809) 


9 


11 


9 


9 


616-633 


838-855 


10f50)*** 


12^49^ 


10 


10 


559-579 


697-717 


11 


13 


11 


11 


586-606* 


791-810* 


12 


14 


12 


12 


892-909 


1172-1189 


14 


15 


14 


14 


586-606 


793-813 


15 




15 


15 


640-660 


817-834 


3 




16 


3 


649-666 


925-942 


18 




18 


18 


589-606 


802-819 


19 


IZ 


19 


19 


607-624 


538-855 


on 

£.\J 


2Q 


20 


20 


574-591 


760-780 


21.47 


21.46 


21 


21 


676-693** 


862-879** 


23 


22 


23 


23 


637-654 


1336-1353 


24 


23 


24 


24 


496-516 


772-792 


26 


25 


26 


26 


553-570 


772-789 


27 


2& 


27 


27 


685-702 


799-819 


28 


2Z 


28 


28 


592-609 


778-798 


29 


m 


29 


29 


538-555 


757-774 


30 


22 


30 


30 


814-831 


943-962 


31 




31 


31 


571-588 


790-807 


32 




32 


32 


514-831 


1057-1074 


33 


22 


33 


33 


553-570 


718-735 


34 


33 


34 


34 


568-585 


796-816 


38.55 


36.53 


38 


38 


553-573 


709-729 


39 


2Z 


39 


39 


556-573 


718-735 


41 




41 


41 


598-615 


784-801 


42 




42 


42 


547-567 


715-735 


43 




43 


43 


580-597 


844-861 


45 




45 


45 


640-657 


943-963 


46 




46 


46 


565-582 


781-801 


49 




49 


49 


589-609 


754-771 


51 




61 


51 


565-582 


1042-1059 


52 


5.1, 


52 


52 


598-615 


829-846 


56 


54 


56 


56 


697-714 


877-897 


8 and 40 


10 and 38 




8 


562-579 


1045-1062 


25 


2d 




25 


529-549 


703-723 


35 


3d 




non-functional H11 aene 


769-789* 


1045-1065* 


37 


35 




37 


520-537 


715-735 


48 


dZ 




48 


568-585 


835-852 


54 


52 




non-functional H21 aene 


988-1008** 


1344-1364** 



See section 13 for choice of primers for the flagellin gene of H11 
See section 13 for choice of primers for the flagellin gene of H21 
See text 
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Table 3A 



H type 
strain from 
wnicn the H 

antigen 


Primers used 

for PGR 
amplification of 
the H antigen 
gene 


SEO ID 

mi 


Annealing 
temperature 
(oC) used for 

PGR 
amplification 


Plasmid 
carrying the 
H antigen 
gene 


Host strain 

used for 
expression 


Anti-serum 

which 
reacts with 
an E. Coli 

mc 

deletion 

strain 
carrying 

the 
plasmid 


H 

antigen 
encode 
d by the 
cloned 
gene 


H1 


#1868 & #1870 


69&71 


55 


DPR1920 


M2126 


HI 


HI 


H2 


#1868 & #1870 


69&71 


55 


DPR1977 


P5560 


H2 


H2 


H3 


#1868 & #1870 


69&71 


55 


DPR1969 


P5560 


H16 


H16 


H4 


#1878 & #1885 


74&76 


65 


DPR1955 


P5560 


H4 


H4 


H5 


#1868 & #1870 


69& 71 


60 


DPR1967 


M2126 


H5 


H5 


H6 


#1868 & #1870 


69&71 


55 


DPR1921 


P5560 


H6 


H6 


H7 


#1868 8t #1870 


69&71 


55 


DPR1919 


P5560 


H7 


H7 


H9 


#1868 & #1870 


69&71 


55 


DPR1922 


P5560 


H9 


H9 


H10 


#1868 & #1870 


69&71 


55 


DPR1923 


P5560 


H10 


H10 


H11 


#1868 & #1870 


69 & 71 


55 


DPR1981 


M21 26 


H11 


H11 


H12 


#1868 & #1870 


69&71 


60 


DPR1990 


M2126 


H12 


H12 


H14 


#1868 & #1870 


RQ&71 


55 


DPR1924 


P5560 


H14 


H14 


H15 


#1868 & #1870 


69&71 


55 


DPR1925 


P5560 


H15 


H15 


H17 


#1878 & #1885 


74&76 


65 


DPR1957 


P5560 


H4 


H4 


H18 


#1 868 & #1870 


69&71 


55 


DPR1986 


M2126 


H18 


H18 


H19 


#1868 & #1870 


69&71 


55 


0PR1927 


P5560 


H19 


H19 


H20 


#1868 & #1870 


69& 71 


55 


DPR1963 


M2126 


H20 


H20 


H21 


#1868 & #1870 


69&71 


55 


DPR1995 


M2126 


H21 


H21 


H23 


#1868 & #1869 


69&70 


55 


DPR1942 


P5560 


H23 


H23 


H24 


#1868 & #1870 


69&71 


55 


DPR1971 


M2126 


H24 


H24 


H26 


#1868 & #1870 


69&71 


65 


DPR1928 


P5560 


H26 


H26 


H27 


#1868 & #1870 


69&71 


55 


DPR1970 


M2126 


H27 


H27 


H28 


#1868 & #1870 


69&71 


60 


DPR1944 


P5660 


H28 


H28 


H29 


#1868 & #1870 


69&71 


55 


DPR1972 


M2126 


H29 


H29 


H30 


#1868 & #1871 


69&72 


55 


DPR1948 


P5560 


H30 


H30 


H31 


#1868 & #1870 


69&71 


65 


DPR1965 


M2126 


H31 


H31 


H32 


#1868 & #1871 


69&72 


55 


DPR1940 


P5560 


H32 


H32 


H33 


#1868 & #1871 


69&72 


55 


DPR1976 


M2126 


H33 


H33 


H34 


#1868 & #1870 


sa-a 71 


65 


PPR1930 


P5560 


H34 


H34 


H38 


#1868 & #1870 


69&71 


48 


DPR1984 


M2126 


H38 


H38 


H39 


#1868 & #1870 


69&71 


48 


DPR1982 


M2126 


H39 


H39 


H41 


#1868 & #1870 


68&71 


65 


DPR1931 


P5560 


H41 


H41 


H42 


#1868 & #1870 


m 71 


50 


DPR1979 


M2126 


H42 


H42 


H43 


#1868 & #1870 


69*71 


65 


DPR1968 


M2126 


H43 


H43 


H45 


#1868 & #1870 


69&71 


60 


DPR1943 


P5560 


H45 


H45 


H46 


#1868 & #1870 


69&71 


60 


DPR 1966 


M2126 


H46 


H46 


H49 


#1868 & #1870 


6S&71 


60 


DPR1985 


M2126 


H49 


H49 


H51 


#1868 & #1870 


68&71 


65 


DPR1941 


P5560 


H51 


H51 


H52 


#1868 & #1870 


69& 71 


65 


DPR1935 


P5560 


H52 


H52 


H56 


#1868 & #1870 


69&71 


50 


DPR1978 


M2126 


H56 


H56 
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Table 3B 

Oligonucleotide primers used for PGR amplification and cloning of 

H antigen genes 



#1868 


SEO ID NO-69 


5' - cat gen atg gca caa Qtc att aat acc -3' 
A/col 


#1 869 


SEO in WO:70 


5' - ata tat caa ctt aac cct aca aca aaa aca a -3' 
Sail 


#1870 


<5EO ID NO!71 


5' - atg gat taa ccc tgc age aga gac aa -3' 
BamH\ 


#1871 


SEO ID NO:72 


5" - aac tgc agt taa ccc tgt age aaa aac aa -3' 
Pst\ 


#1872 


SEO ID NO:73 


5' - egg gat ccc gca gac tgg ttc ttg ttg at -3* 
BamHI 


#1878 


55EO ID NO:74 


5' - egg gat cea ctt eta teg age gee tct ct -3' 
BamH! 


#1884 


SEO ID rMO:75 


5' - geLcta_gag ego aga tea ttc age agg cc -3' 
Xbal 


#1885 


SEO ID NO:76 


5' - get_ctaLgac atg ttg gac act teg gtc gc -3' 
Xbal 
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TABLE 4 



Pool Strains of which chromosonal DNA included in the pool Source* 
No. 

1 E. coli type strains for O serotypes 1, 2, 3, 4, 10, 16, 18 and 39 IMVS» 

2 £. coli type strains for O serotypes 40, 41, 48, 49, 71, 73, 88 and 100 IMVS 

3 E. coli type strains for O serotypes 102, 109, 119, 120, 121, 125, 126 and IMVS 
137 

4 E. coli type strains for O serotypes 138, 139, 149, 7, 5, 6, 11 and 12 IMVS 

5 E. coli type strains for O serotypes 13, 14, 15, 17, 19ab, 20, 21 and 22 IMVS 

6 E. coli type strains for O serotypes 23, 24, 25, 26, 27, 28, 29 and 30 IMVS 

7 E. coli type strains for O serotypes 32, 33, 34, 35, 36, 37, 38 and 42 IMVS 

8 E.co/i type strains for O serotypes 43, 44, 45, 46, 50, 51, 52 and 53 IMVS 

9 E. coli type strains for O serotypes 54, 55, 56, 57, 58, 59, 60 and 61 IMVS 

10 E. coli type strains for O serotypes 62, 63, 64, 65, 66, 68, 69 and 70 IMVS 

11 E. coli type strains for O serotypes 74, 75, 76, 77, 78, 79, 80 and 81 IMVS 

12 E. coli type strains for O serotypes 82, 83, 84, 85, 86, 87, 89 and 90 IMVS 

13 E. coli type strains for O serotypes 91, 92, 95, 96, 97, 98, 99 and 101 IMVS 

14 E. coli type strains for O serotypes 103, 104, 105, 106, 107, 108 and 110 IMVS 

15 E. coZi type strains for O serotypes 112, 162, 113, 114, 115, 116, 117 and IMVS 
118 

16 E. coli type straiite for O serotypes 123, 165, 166, 167, 168, 169, 170 and See b 
171 

17 E. coli type strains for O serotypes 172, 173, 127, 128, 129, 130, 131 and See c 
132 

18 £. coli type strains for O serotypes 133, 134, 135, 136, 140, 141, 142 and IMVS 
143 

19 E. coli type strains for O serotypes 144, 145, 146, 147, 148, 150, 151 and IMVS 
152 



a. Institute of Medical and Veterinary Science, Adelaide, Australia 

b. 123 fronn IMVS; the rest from Statens Serum Institute Copenhagen, Denmark 

c. 172 and 173 from Statens Serum Institut, Copenhagen, Denmark, the rest from 
IMVS 
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TABLE 5 



Pool 
No. 


Strains of which chromosonal DNA included in the pool 


Source* 


20 


E. coli type strains for O serotypes 153, 154, 155, 156, 157, 158 , 159 and 

160 


IMVS 


21 


£. coli type strains for O serotypes 161, 163, 164, 8, 9 and 124 


IMVS 


22 


As pool #21, plus £. cgU 0111 type strain Stoke \\\ 


IMVS 


23 


As pool #21, plus E. coli 0111:H2 strain C1250-1991 


See d 


24 


As pool #21, plus E. coli 0111:H12 strain C156-1989 


See e 


25 


As pool #21, plus S. enterica serovar Adelaide 


Seef 


26 


Y. pseudotuberculosis strains of O groups lA, IIA, HB, IIC, III, IVA, IVB, 
VA,VB,VIandVII 


Seeg 


27 


S. boydii strains of serogroups 1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 14 and 15 


See h 


28 


S. enterica strains of serovars (each representing a different O group) 
Typhi, Montevideo, Ferruch, Jangwani, Raus, Hvittingfoss, Waycross, 
Dan, Dugbe, Basel, 65,:i:e,n,z,15 and 52:d:e,n,x,zl5 


IMVS 



d. C1250-1991 from Statens Serum Institut, Copenhagen, Denmark 

e. C156-1989 from Statens Serum Institut, Copenhagen, Denmark 

f. S. enterica serovar Adelaide from IMVS 

g. Dr S Aleksic of Institute of Hygiene, Germany 

h. Dr J Lefebvre of Bacterial Identification Section, Laboratoroie de Sante Publique 
du Quebec, Canada 
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TABLE 6 



Pool 
No. 


Strains of which chromosonal DNA included in the pool 


Source* 


29 


E. coli type strains for O serotypes 153, 154, 155, 156, 158, 159 and 160 


IMVS 


30 


E. coli type strains for O serotypes 161, 163, 164, 8, 9, 111 and 124 


IMVS 




pool TT^v , plus iL. Cull rype sirain /\.^ ^wiD/.XiX"^ 


IJVl VZ> 




/\s pool plus c. coil uiD/:rllo strain L.4/D-oy 


bee a 


33 


As pool #29, plus E. coli 0157:H45 strain C727-89 


See d 


34 


As pool #29, plus E. coli 0157-.H2 strain C252-94 


Seed 


35 


As pool #29, plus E. coli 0157:H39 strain C258-94 


Seed 


36 


As pool #29, plus E, coli 0157:H26 


See e 


37 


As pool #29, plus S. enlerica serovar Landau 


Seef 


38 


As pool #29, plus Brucella abortus 


Seeg 
Seeh 


39 


/TLO pUUl It^y, piUb J . EnlcYOCOllllCU KJ7 




40 


y. pseudotuberculosis strains of O groups lA, IIA, IIB, IIC, III, IVA, IVB, VA, 
VB, VI and VII 


Seei 


41 


S. boydii strains of serogroups 1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 14 and 15 


See j 


42 


S. enterica strains of serovars (each representing a different O group) Typhi, 
Montevideo, Ferruch, Jangwani, RauS/ Hvittingfoss, Waycross, Dan, Dugbe, 
Basel, 65:i:e,n,zl5 and 52:d:e,n,x,zl5 


IMVS 


43 


E. coli type strains for O serotypes 1,2,3,4,10,18 and 29 


IMVS 


44 


As pool #43, plus E. coli K-12 strains C600 and WGl 


IVMS 
Seek 



d. 0157 strains from Statens Serum Institut, Copenhagen, Denmark 

e. 0157:H26 from Dr R Brown of Royal Children's Hospital, Melbourne, Victoria 

f. S. enterica serovar Landau from Dr M Poppoff of Institut Pasteur, Paris, France 

g. B. Abortus from the culture collection of The University of Sydney, Sydney, Australia 
h- Y. enterocolitica 09 from Dr. K. Bettelheim of Victorian Infectious Diseases Reference 

Laboratory Victoria, Austraha. 
i, Dr S Aleksic of Institute of Hygiene, Germany 

J. Dr J Lefebvre of Bacterial Identification Section, Laboratoroie de Sante Publique du 
Quebec, Canada 

k. Strains C600 and WGl from Dr. B.J. Backmann of Department of Biology, Yale 
University, USA. 
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